FUSION PROTEINS OF MYCOBACTERIUM
TUBERCULOSIS ANTIGENS AND THEIR USES



1. INTRODUCTION

The present invention relates to fusion proteins containing at least two
Mycobacterium tuberculosis antigens. In particular, it relates to bi-fusion proteins which
contain two individual M. tuberculosis antigens, tri-fusion proteins which contain three
M. tuberculosis antigens, tetra-fusion proteins which contain four M. tuberculosis antigens, and
penta-fusion proteins which contain five M. tuberculosis antigens, and to methods for their use
in the diagnosis, treatment and prevention of tuberculosis infection.

2. BACKGROUND OF THE INVENTION

Tuberculosis is a chronic infectious disease caused by infection with M.
tuberculosis. It is a major disease in developing countries, as well as an increasing problem
in developed areas of the world, with about 8 million new cases and 3 million deaths each
year. Although the infection may be asymptomatic for a considerable period of time, the
disease is most commonly manifested as an acute inflammation of the lungs, resulting in
fever and a nonproductive cough. If untreated, serious complications and death typically
result.

Although tuberculosis can generally be controlled using extended antibiotic therapy,
such treatment is not sufficient to prevent the spread of the disease. Infected individuals
may be asymptomatic, but contagious, for some time. In addition, although compliance
with the treatment regimen is critical, patient behavior is difficult to monitor. Some patients
do not complete the course of treatment, which can lead to ineffective treatment and the
development of drug resistance.

In order to control the spread of tuberculosis, effective vaccination and accurate
early diagnosis of the disease are of utmost importance. Currently, vaccination with live
bacteria is the most efficient method for inducing protective immunity. The most common
Mycobacterium employed for this purpose is Bacillus Calmette-Guerin (BCG), an avirulent
strain of M. bovis. However, the safety and efficacy of BCG are a source of controversy and
some countries, such as the United States, do not vaccinate the general public with this
agent.

Diagnosis of tuberculosis is commonly achieved using a skin test, which involves
intradermal exposure to tuberculin PPD (protein-purified derivative). Antigen-specific T
cell responses result in measurable induration at the injection site by 48-72 hours after
injection, which indicates exposure to mycobacterial antigens. Sensitivity and specificity
have, however, been a problem with this test, and individuals vaccinated with BCG cannot
be distinguished from infected individuals.

While macrophages have been shown to act as the principal effectors of M.
tuberculosis immunity, T cells are the predominant inducers of such immunity. The
essential role of T cells in protection against M. tuberculosis infection is illustrated by the
frequent occurrence of M. tuberculosis in Acquired Immunodeficiency Syndrome patients,
due to the depletion of CD4+ T cells associated with human immunodeficiency virus (HIV)
infection. Mycobacterium-reactive CD4+ T cells have been shown to be potent producers of
gamma-interferon (IFN-γ), which, in turn, has been shown to trigger the anti-mycobacterial
effects of macrophages in mice. While the role of IFN-γ in humans is less clear, studies
have shown that 1,25-dihydroxy-vitamin D3, either alone or in combination with IFN-γ or
tumor necrosis factor-alpha, activates human macrophages to inhibit M. tuberculosis
infection. Furthermore, it is known that IFN-γ stimulates human macrophages to make
1,25-dihydroxy-vitamin D3. Similarly, interleukin-12 (IL-12) has been shown to play a role
in stimulating resistance to M. tuberculosis infection. For a review of the immunology of
M. tuberculosis infection, see Chan and Kaufmann, 1994, in Tuberculosis: Pathogenesis,
Protection and Control, Bloom (ed.), ASM Press, Washington, DC.

Accordingly, there is a need for improved vaccines, and improved methods for
diagnosing, preventing and treating tuberculosis.

3. SUMMARY OF THE INVENTION

The present invention relates to fusion proteins of M. tuberculosis antigens. In
particular, it relates to fusion polypeptides that contain two or more M. tuberculosis
antigens, polynucleotides encoding such polypeptides, and methods of using the polypeptides
and polynucleotides in the diagnosis, treatment and prevention of M. tuberculosis infection.

The present invention is based, in part, on the inventors' discovery that
polynucleotides which contain two to five M. tuberculosis coding sequences produce
recombinant fusion proteins that retain the immunogenicity and antigenicity of their
individual components. The fusion proteins described herein induced both T cell and B cell
responses, as measured by T cell proliferation, cytokine production, and antibody
production. Furthermore, a fusion protein was used as an immunogen with adjuvants in
vivo to elicit both cell-mediated and humoral immunity to M. tuberculosis. Additionally, a
fusion protein was made by a fusion construct and used in a vaccine formulation with an
adjuvant to afford long-term protection in animals against the development of tuberculosis.
The fusion protein was a more effective immunogen than a mixture of its individual protein
components.

In a specific embodiment of the invention, the isolated or purified M. tuberculosis
polypeptides of the invention may be formulated as pharmaceutical compositions for
administration into a subject in the prevention and/or treatment of M. tuberculosis infection.
The immunogenicity of the fusion protein may be enhanced by the inclusion of an
adjuvant.

In another aspect of the invention, the isolated or purified polynucleotides are used
to produce recombinant fusion polypeptide antigens in vitro. Alternatively, the
polynucleotides may be administered directly into a subject as DNA vaccines to cause
antigen expression in the subject, and the subsequent induction of an anti-M. tuberculosis
immune response.

It is also an object of the invention that the polypeptides be used in in vitro assays
for detecting humoral antibodies or cell-mediated immunity against M. tuberculosis for
diagnosis of infection or monitoring of disease progression. Additionally, the polypeptides
may be used as an in vivo diagnostic agent in the form of an intradermal skin test.
Alternatively, the polypeptides may be used as immunogens to generate anti-M.
tuberculosis antibodies in a non-human animal. The antibodies can be used to detect the
target antigens in vivo and in vitro.

4. BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1A and 1B: The nucleotide sequence (SEQ ID NO:1) and amino acid
sequence (SEQ ID NO:2) of tri-fusion protein Ra12-TbH9-Ra35
(designated Mtb32A).

Figure 2: The nucleotide sequence (SEQ ID NO:3) and amino acid
sequence (SEQ ID NO:4) of tri-fusion protein Erd14-DPV-MTI
(designated Mtb39A).

Figure 3A-3D: The nucleotide sequence (SEQ ID NO:5) and amino acid
sequence (SEQ ID NO:6) of tri-fusion protein TbRa3-38kD-Tb38-1.

Figure 4A-4D: The nucleotide sequence (SEQ ID NO:7) and amino acid
sequence (SEQ ID NO:8) of bi-fusion protein TbH9-Tb38-1.

Figure 5A-5J: The nucleotide sequence (SEQ ID NO:9) and amino acid
sequence (SEQ ID NO:10) of tetra-fusion protein TbRa3-38kD-Tb38-1-DPEP
(designated TbF-2).

Figure 6A and 6B: The nucleotide sequence (SEQ ID NO:11) and amino acid
sequence (SEQ ID NO:12) of penta-fusion protein Erd14-DPV-MTI-MSL-MTCC2
(designated Mtb88f).

Figure 7A and 7B: The nucleotide sequence (SEQ ID NO:13) and amino acid
sequence (SEQ ID NO:14) of tetra-fusion protein Erd14-DPV-MTI-MSL
(designated Mtb46f).

Figure 8A and 8B: The nucleotide sequence (SEQ ID NO:15) and amino acid
sequence (SEQ ID NO:16) of tetra-fusion protein DPV-MTI-MSL-MTCC2
(designated Mtb71f).

Figure 9A and 9B: The nucleotide sequence (SEQ ID NO:17) and amino acid
sequence (SEQ ID NO:18) of tri-fusion protein DPV-MTI-MSL
(designated Mtb31f).

Figure 10A and 10B: The nucleotide sequence (SEQ ID NO:19) and amino acid
sequence (SEQ ID NO:20) of tri-fusion protein TbH9-DPV-MTI
(designated Mtb61f).

Figure 11A and 11B: The nucleotide sequence (SEQ ID NO:21) and amino acid
sequence (SEQ ID NO:22) of tri-fusion protein Erd14-DPV-MTI
(designated Mtb36f).

Figure 12A and 12B: The nucleotide sequence (SEQ ID NO:23) and amino acid
sequence (SEQ ID NO:24) of bi-fusion protein TbH9-Ra35
(designated Mtb59f).

Figure 13A and 13B: The nucleotide sequence (SEQ ID NO:25) and amino acid
sequence (SEQ ID NO:26) of bi-fusion protein Ra12-DPPD
(designated Mtb24).

Figure 14A-14F: T cell proliferation responses of six PPD+ subjects when
stimulated with two fusion proteins and their individual components.

Figure 15A-15F: IFN-γ production of six PPD+ subjects when stimulated with
two fusion proteins and their individual components.

Figure 16A-16F: T cell proliferation of mice immunized with a fusion protein
or its individual components and an adjuvant.

Figure 17: IFN-γ production of mice immunized with a fusion protein or
its individual components and an adjuvant.

Figure 18: IL-4 production of mice immunized with a fusion protein or
its individual components and an adjuvant.

Figure 19A-19F: Serum antibody concentrations of mice immunized with a
fusion protein or its individual components and an adjuvant.

Figure 20A-20C: Survival of guinea pigs after aerosol challenge of M.
tuberculosis. Fusion proteins, Mtb32A and Mtb39A, were
formulated in adjuvant SBAS1c (20A), SBAS2 (20B) or
SBAS7 (20C), and used as an immunogen in guinea pigs
prior to challenge with bacteria. BCG is the positive control.

Figure 21A and 21B: Stimulation of proliferation and IFN-γ production in
TbH9-specific T cells by the fusion protein TbH9-Tb38-1.

Figure 22A and 22B: Stimulation of proliferation and IFN-γ production in
Tb38-1-specific T cells by the fusion protein TbH9-Tb38-1.

Figure 23A and 23B: Stimulation of proliferation and IFN-γ production in T cells
previously shown to respond to both TbH-9 and Tb38-1
antigens by the fusion protein TbH9-Tb38-1.

5. DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to antigens useful for the treatment and prevention of
tuberculosis, polynucleotides encoding such antigens, and methods for their use. The
antigens of the present invention are fusion polypeptides of M. tuberculosis antigens and
variants thereof. More specifically, the antigens of the present invention comprise at least
two polypeptides of M. tuberculosis that are fused into a larger fusion polypeptide
molecule. The antigens of the present invention may further comprise other components
designed to enhance the immunogenicity of the antigens or to improve these antigens in
other aspects, for example, the isolation of these antigens through addition of a stretch of
histidine residues at one end of the antigen.

5.1. M. TUBERCULOSIS SPECIFIC ANTIGENS

The antigens of the present invention are exemplified in Figures 1A through 13B,
including homologues and variants of those antigens. These antigens may be modified, for
example, by adding linker peptide sequences as described below. These linker peptides
may be inserted between one or more polypeptides which make up each of the fusion
proteins presented in Figures 1A through 13B. Other antigens of the present invention are
antigens described in Figures 1A through 13B which have been linked to a known antigen
of M. tuberculosis, such as the previously described 38 kD (SEQ ID NO:27) antigen
(Andersen and Hansen, 1989, Infect. Immun. 57:2481-2488; Genbank Accession No.
M30046).

5.2. IMMUNOGENICITY ASSAYS 

Antigens described herein, and immunogenic portions thereof, have the ability to
induce an immunogenic response. More specifically, the antigens have the ability to induce
proliferation and/or cytokine production (i.e., interferon-γ and/or interleukin-12 production)
in T cells, NK cells, B cells and/or macrophages derived from an M. tuberculosis-immune
individual. The selection of cell type for use in evaluating an immunogenic response to an
antigen will depend on the desired response. For example, interleukin-12 production is
most readily evaluated using preparations containing B cells and/or macrophages. An
M. tuberculosis-immune individual is one who is considered to be resistant to the
development of tuberculosis by virtue of having mounted an effective T cell response to
M. tuberculosis (i.e., substantially free of disease symptoms). Such individuals may be
identified based on a strongly positive (i.e., greater than about 10 mm diameter induration)
intradermal skin test response to tuberculosis proteins (PPD) and an absence of any signs or
symptoms of tuberculosis disease. T cells, NK cells, B cells and macrophages derived from
M. tuberculosis-immune individuals may be prepared using methods known to those of
ordinary skill in the art. For example, a preparation of PBMCs (i.e., peripheral blood
mononuclear cells) may be employed without further separation of component cells.
PBMCs may generally be prepared, for example, using density centrifugation through
"FICOLL" (Winthrop Laboratories, NY). T cells for use in the assays described herein may
also be purified directly from PBMCs. Alternatively, an enriched T cell line reactive
against mycobacterial proteins, or T cell clones reactive to individual mycobacterial
proteins, may be employed. Such T cell clones may be generated by, for example, culturing
PBMCs from M. tuberculosis-immune individuals with mycobacterial proteins for a period
of 2-4 weeks. This allows expansion of only the mycobacterial protein-specific T cells,
resulting in a line composed solely of such cells. These cells may then be cloned and tested
with individual proteins, using methods known to those of ordinary skill in the art, to more
accurately define individual T cell specificity. In general, antigens that test positive in
assays for proliferation and/or cytokine production (i.e., interferon-γ and/or interleukin-12
production) performed using T cells, NK cells, B cells and/or macrophages derived from an
M. tuberculosis-immune individual are considered immunogenic. Such assays may be
performed, for example, using the representative procedures described below.
Immunogenic portions of such antigens may be identified using similar assays, and may be
present within the polypeptides described herein.

The ability of a polypeptide (e.g., an immunogenic antigen, or a portion or other
variant thereof) to induce cell proliferation is evaluated by contacting the cells (e.g., T cells
and/or NK cells) with the polypeptide and measuring the proliferation of the cells. In
general, the amount of polypeptide that is sufficient for evaluation of about 10⁵ cells ranges
from about 10 ng/mL to about 100 μg/mL and preferably is about 10 μg/mL. The
incubation of polypeptide with cells is typically performed at 37°C for about six days.
Following incubation with polypeptide, the cells are assayed for a proliferative response,
which may be evaluated by methods known to those of ordinary skill in the art, such as
exposing cells to a pulse of radiolabeled thymidine and measuring the incorporation of label
into cellular DNA. In general, a polypeptide that results in at least a three-fold increase in
proliferation above background (i.e., the proliferation observed for cells cultured without
polypeptide) is considered to be able to induce proliferation.
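
By way of illustration and not limitation, the three-fold criterion above may be sketched in Python as follows. The function names and the example counts are hypothetical and are not taken from the specification; the sketch also assumes that "three-fold increase above background" means the stimulated counts are at least three times the background counts.

```python
def stimulation_index(stimulated_cpm, background_cpm):
    """Ratio of thymidine incorporation with polypeptide to incorporation without it."""
    if background_cpm <= 0:
        raise ValueError("background counts must be positive")
    return stimulated_cpm / background_cpm


def induces_proliferation(stimulated_cpm, background_cpm, fold_threshold=3.0):
    """Assumed reading of the criterion: stimulated counts >= 3 x background counts."""
    return stimulation_index(stimulated_cpm, background_cpm) >= fold_threshold


if __name__ == "__main__":
    # Hypothetical counts per minute, for illustration only.
    print(induces_proliferation(4500, 1200))
    print(induces_proliferation(2000, 1200))
```

The threshold is kept as a parameter because the text states the three-fold value only as a general guide.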

The ability of a polypeptide to stimulate the production of interferon-γ and/or
interleukin-12 in cells may be evaluated by contacting the cells with the polypeptide and
measuring the level of interferon-γ or interleukin-12 produced by the cells. In general, the
amount of polypeptide that is sufficient for the evaluation of about 10⁵ cells ranges from
about 10 ng/mL to about 100 μg/mL and preferably is about 10 μg/mL. The polypeptide
may be, but need not be, immobilized on a solid support, such as a bead or a biodegradable
microsphere, such as those described in U.S. Patent Nos. 4,897,268 and 5,075,109. The
incubation of polypeptide with the cells is typically performed at 37°C for about six days.
Following incubation with polypeptide, the cells are assayed for interferon-γ and/or
interleukin-12 (or one or more subunits thereof), which may be evaluated by methods
known to those of ordinary skill in the art, such as an enzyme-linked immunosorbent assay
(ELISA) or, in the case of IL-12 P70 subunit, a bioassay such as an assay measuring
proliferation of T cells. In general, a polypeptide that results in the production of at least
50 pg of interferon-γ per mL of cultured supernatant (containing 10⁴-10⁵ T cells per mL) is
considered able to stimulate the production of interferon-γ. A polypeptide that stimulates
the production of at least 10 pg/mL of IL-12 P70 subunit, and/or at least 100 pg/mL of
IL-12 P40 subunit, per 10⁵ macrophages or B cells (or per 3 x 10⁵ PBMC) is considered able
to stimulate the production of IL-12.
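
By way of illustration and not limitation, the cytokine thresholds stated above may be expressed as a short Python sketch. The constant and function names are hypothetical; the numeric thresholds are those given in the text (50 pg/mL interferon-γ, 10 pg/mL IL-12 P70, 100 pg/mL IL-12 P40), and the sketch assumes the measurements were taken at the cell densities described.

```python
IFN_GAMMA_MIN_PG_PER_ML = 50.0   # interferon-gamma threshold from the text
IL12_P70_MIN_PG_PER_ML = 10.0    # IL-12 P70 subunit threshold from the text
IL12_P40_MIN_PG_PER_ML = 100.0   # IL-12 P40 subunit threshold from the text


def stimulates_ifn_gamma(ifn_gamma_pg_per_ml):
    """True if measured interferon-gamma meets the stated minimum."""
    return ifn_gamma_pg_per_ml >= IFN_GAMMA_MIN_PG_PER_ML


def stimulates_il12(p70_pg_per_ml=None, p40_pg_per_ml=None):
    """True if either subunit measurement meets its stated minimum ("and/or")."""
    p70_ok = p70_pg_per_ml is not None and p70_pg_per_ml >= IL12_P70_MIN_PG_PER_ML
    p40_ok = p40_pg_per_ml is not None and p40_pg_per_ml >= IL12_P40_MIN_PG_PER_ML
    return p70_ok or p40_ok


if __name__ == "__main__":
    # Hypothetical ELISA readings, for illustration only.
    print(stimulates_ifn_gamma(120.0))
    print(stimulates_il12(p70_pg_per_ml=4.0, p40_pg_per_ml=150.0))
```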

In general, immunogenic antigens are those antigens that stimulate proliferation
and/or cytokine production (i.e., interferon-γ and/or interleukin-12 production) in T cells,
NK cells, B cells and/or macrophages derived from at least about 25% of M. tuberculosis-
immune individuals. Among these immunogenic antigens, polypeptides having superior
therapeutic properties may be distinguished based on the magnitude of the responses in the
above assays and based on the percentage of individuals for which a response is observed.
In addition, antigens having superior therapeutic properties will not stimulate proliferation
and/or cytokine production in vitro in cells derived from more than about 25% of
individuals who are not M. tuberculosis-immune, thereby eliminating responses that are not
specifically due to M. tuberculosis-responsive cells. Those antigens that induce a response
in a high percentage of T cell, NK cell, B cell and/or macrophage preparations from
M. tuberculosis-immune individuals (with a low incidence of responses in cell preparations
from other individuals) have superior therapeutic properties.
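
By way of illustration and not limitation, the responder-frequency criteria above may be sketched in Python. The function names and the example data are hypothetical; each list entry is assumed to record whether cells from one individual gave a positive response in the assays described.

```python
def responder_fraction(responses):
    """Fraction of individuals whose cells responded (True) in the assay."""
    responses = list(responses)
    if not responses:
        raise ValueError("at least one individual is required")
    return sum(1 for r in responses if r) / len(responses)


def is_immunogenic(immune_responses, min_fraction=0.25):
    """Response in at least about 25% of M. tuberculosis-immune individuals."""
    return responder_fraction(immune_responses) >= min_fraction


def is_specific(non_immune_responses, max_fraction=0.25):
    """Response in no more than about 25% of non-immune individuals."""
    return responder_fraction(non_immune_responses) <= max_fraction


if __name__ == "__main__":
    # Hypothetical panels of donors, for illustration only.
    immune = [True, True, False, True, False, True]
    non_immune = [False, False, True, False, False, False]
    print(is_immunogenic(immune), is_specific(non_immune))
```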

Antigens with superior therapeutic properties may also be identified based on their
ability to diminish the severity of M. tuberculosis infection in experimental animals, when
administered as a vaccine. Suitable vaccine preparations for use on experimental animals
are described in detail below. Efficacy may be determined based on the ability of the
antigen to provide at least about a 50% reduction in bacterial numbers and/or at least about
a 40% decrease in mortality following experimental infection. Suitable experimental
animals include mice, guinea pigs and primates.
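
By way of illustration and not limitation, the efficacy criteria above may be sketched in Python. The function names and example values are hypothetical. The sketch assumes that both percentages are relative reductions compared with an unvaccinated control group, which the text does not state explicitly.

```python
def percent_reduction(control, vaccinated):
    """Relative reduction (in percent) of a measurement versus the control group."""
    if control <= 0:
        raise ValueError("control value must be positive")
    return 100.0 * (control - vaccinated) / control


def meets_efficacy_criteria(control_cfu, vaccinated_cfu,
                            control_mortality, vaccinated_mortality):
    """True if bacterial numbers fall by >= 50% and/or mortality falls by >= 40%."""
    bacterial_ok = percent_reduction(control_cfu, vaccinated_cfu) >= 50.0
    mortality_ok = percent_reduction(control_mortality, vaccinated_mortality) >= 40.0
    return bacterial_ok or mortality_ok


if __name__ == "__main__":
    # Hypothetical colony counts and mortality fractions, for illustration only.
    print(meets_efficacy_criteria(1e6, 4e5, 0.8, 0.7))
```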



5.3. ISOLATION OF CODING SEQUENCES

The present invention also relates to nucleic acid molecules that encode fusion
polypeptides of M. tuberculosis. In a specific embodiment by way of example in Section 6,
infra, thirteen M. tuberculosis fusion coding sequences were constructed. In accordance
with the invention, any nucleotide sequence which encodes the amino acid sequence of the
fusion protein can be used to generate recombinant molecules which direct the expression
of the coding sequence.

In order to clone full-length coding sequences or homologous variants to generate
the fusion polynucleotides, labeled DNA probes designed from any portion of the
nucleotide sequences or their complements disclosed herein may be used to screen a
genomic or cDNA library made from various strains of M. tuberculosis to identify the
coding sequence of each individual component. Isolation of coding sequences may also be
carried out by the polymerase chain reaction (PCR) using two degenerate oligonucleotide
primer pools designed on the basis of the coding sequences disclosed herein.

The invention also relates to isolated or purified polynucleotides complementary to
the nucleotide sequences of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23 and 25, and
polynucleotides that selectively hybridize to such complementary sequences. In a preferred
embodiment, a polynucleotide which hybridizes to the sequence of SEQ ID NOS:1, 3, 5, 7,
9, 11, 13, 15, 17, 19, 21, 23 and 25 or its complementary sequence under conditions of low
stringency and encodes a protein that retains the immunogenicity of the fusion proteins of
SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24 and 26 is provided. By way of
example and not limitation, exemplary conditions of low stringency are as follows (see also
Shilo and Weinberg, 1981, Proc. Natl. Acad. Sci. USA 78:6789-6792): Filters containing
DNA are pretreated for 6 h at 40°C in a solution containing 35% formamide, 5X SSC, 50
mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.1% PVP, 0.1% Ficoll, 1% BSA, and 500 μg/mL
denatured salmon sperm DNA. Hybridizations are carried out in the same solution with the
following modifications: 0.02% PVP, 0.02% Ficoll, 0.2% BSA, 100 μg/mL salmon sperm
DNA, 10% (wt/vol) dextran sulfate, and 5-20 x 10⁶ cpm ³²P-labeled probe is used. Filters
are incubated in hybridization mixture for 18-20 h at 40°C, and then washed for 1.5 h at
55°C in a solution containing 2X SSC, 25 mM Tris-HCl (pH 7.4), 5 mM EDTA, and 0.1%
SDS. The wash solution is replaced with fresh solution and incubated an additional 1.5 h at
60°C. Filters are blotted dry and exposed for autoradiography. If necessary, filters are
washed for a third time at 65-68°C and re-exposed to film. Other conditions of low
stringency which may be used are well known in the art (e.g., as employed for cross-species
hybridizations).

In another preferred embodiment, a polynucleotide which hybridizes to the coding
sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23 and 25 or its
complementary sequence under conditions of high stringency and encodes a protein that
retains the immunogenicity of the fusion proteins of SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16,
18, 20, 22, 24 and 26 is provided. By way of example and not limitation, exemplary
conditions of high stringency are as follows: Prehybridization of filters containing DNA is
carried out for 8 h to overnight at 65°C in buffer composed of 6X SSC, 50 mM Tris-HCl
(pH 7.5), 1 mM EDTA, 0.02% PVP, 0.02% Ficoll, 0.02% BSA, and 500 μg/mL denatured
salmon sperm DNA. Filters are hybridized for 48 h at 65°C in prehybridization mixture
containing 100 μg/mL denatured salmon sperm DNA and 5-20 x 10⁶ cpm of ³²P-labeled
probe. Washing of filters is done at 37°C for 1 h in a solution containing 2X SSC, 0.01%
PVP, 0.01% Ficoll, and 0.01% BSA. This is followed by a wash in 0.1X SSC at 50°C for
45 min before autoradiography. Other conditions of high stringency which may be used are
well known in the art.

In yet another preferred embodiment, a polynucleotide which hybridizes to the
coding sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23 and 25 or its
complementary sequence under conditions of moderate stringency and encodes a protein
that retains the immunogenicity of the fusion proteins of SEQ ID NOS:2, 4, 6, 8, 10, 12, 14,
16, 18, 20, 22, 24 and 26 is provided. Exemplary conditions of moderate stringency are as
follows: Filters containing DNA are pretreated for 6 h at 55°C in a solution containing 6X
SSC, 5X Denhardt's solution, 0.5% SDS and 100 μg/mL denatured salmon sperm DNA.
Hybridizations are carried out in the same solution and 5-20 x 10⁶ cpm ³²P-labeled probe is
used. Filters are incubated in hybridization mixture for 18-20 h at 55°C, and then washed
twice for 30 minutes at 60°C in a solution containing 1X SSC and 0.1% SDS. Filters are
blotted dry and exposed for autoradiography. Other conditions of moderate stringency
which may be used are well known in the art. Washing of filters is done at 37°C for 1 h in
a solution containing 2X SSC, 0.1% SDS.

5.4. POLYPEPTIDES ENCODED BY THE CODING SEQUENCES

In accordance with the invention, a polynucleotide of the invention which encodes a
fusion protein, fragments thereof, or functional equivalents thereof may be used to generate
recombinant nucleic acid molecules that direct the expression of the fusion protein,
fragments thereof, or functional equivalents thereof, in appropriate host cells. The fusion
polypeptide products encoded by such polynucleotides may be altered by molecular
manipulation of the coding sequence.

Due to the inherent degeneracy of the genetic code, other DNA sequences which
encode substantially the same or a functionally equivalent amino acid sequence may be
used in the practice of the invention for the expression of the fusion polypeptides. Such
DNA sequences include those which are capable of hybridizing to the coding sequences or
their complements disclosed herein under low, moderate or high stringency conditions
described in Section 5.3, supra.
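
By way of illustration and not limitation, the degeneracy of the genetic code may be sketched in Python. The codon table below is a small excerpt of the standard genetic code, and the two DNA sequences are hypothetical examples that do not correspond to any sequence disclosed herein; the function name is likewise an assumption made for this sketch.

```python
# Excerpt of the standard genetic code (DNA codons -> one-letter amino acids).
CODON_TABLE = {
    "ATG": "M",
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S", "AGT": "S", "AGC": "S",
    "CAT": "H", "CAC": "H",
    "TAA": "*", "TAG": "*", "TGA": "*",
}


def translate(dna):
    """Translate a coding sequence codon by codon, stopping at the first stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    protein = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        if codon not in CODON_TABLE:
            raise KeyError(f"codon {codon} is not in this excerpt of the table")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Two different hypothetical DNA sequences that encode the same peptide.
seq_a = "ATGGGTTCTCATCACTAA"
seq_b = "ATGGGCAGCCACCATTGA"

if __name__ == "__main__":
    print(translate(seq_a), translate(seq_b), seq_a != seq_b)
```

The two sequences differ at several nucleotide positions yet translate to the same amino acid sequence, which is the property relied upon in the paragraph above.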

Altered nucleotide sequences which may be used in accordance with the invention
include deletions, additions or substitutions of different nucleotide residues resulting in a
sequence that encodes the same or a functionally equivalent gene product. The gene
product itself may contain deletions, additions or substitutions of amino acid residues,
which result in a silent change thus producing a functionally equivalent antigenic epitope.
Such conservative amino acid substitutions may be made on the basis of similarity in
polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of
the residues involved. For example, negatively charged amino acids include aspartic acid
and glutamic acid; positively charged amino acids include lysine, histidine and arginine;
amino acids with uncharged polar head groups having similar hydrophilicity values include
the following: glycine, asparagine, glutamine, serine, threonine and tyrosine; and amino
acids with nonpolar head groups include alanine, valine, isoleucine, leucine, phenylalanine,
proline, methionine and tryptophan.
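
By way of illustration and not limitation, the amino acid groupings listed above may be sketched in Python using standard one-letter codes. The dictionary and function names are hypothetical; the sketch treats a substitution as conservative only when both residues fall within the same listed group, which is one simple reading of the paragraph above.

```python
# Groups as listed in the text, written with standard one-letter amino acid codes.
CONSERVATIVE_GROUPS = {
    "negatively charged": set("DE"),        # aspartic acid, glutamic acid
    "positively charged": set("KHR"),       # lysine, histidine, arginine
    "uncharged polar": set("GNQSTY"),       # Gly, Asn, Gln, Ser, Thr, Tyr
    "nonpolar": set("AVILFPMW"),            # Ala, Val, Ile, Leu, Phe, Pro, Met, Trp
}


def residue_group(residue):
    """Return the name of the listed group containing the residue, or None."""
    for name, members in CONSERVATIVE_GROUPS.items():
        if residue.upper() in members:
            return name
    return None


def is_conservative_substitution(original, replacement):
    """True if both residues belong to the same listed group."""
    group = residue_group(original)
    return group is not None and group == residue_group(replacement)


if __name__ == "__main__":
    print(is_conservative_substitution("D", "E"))
    print(is_conservative_substitution("D", "K"))
```

Residues not named in the text (for example cysteine) are deliberately left ungrouped in this sketch.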

The nucleotide sequences of the invention may be engineered in order to alter the
fusion protein coding sequence for a variety of ends, including but not limited to, alterations
which modify processing and expression of the gene product. For example, mutations may
be introduced using techniques which are well known in the art, e.g., site-directed
mutagenesis, to insert new restriction sites, to alter glycosylation patterns, phosphorylation,
etc.

In an alternate embodiment of the invention, the coding sequence of a fusion protein
could be synthesized in whole or in part, using chemical methods well known in the art.
See, e.g., Caruthers et al., 1980, Nuc. Acids Res. Symp. Ser. 7:215-233; Crea and Horn, 1980,
Nuc. Acids Res. 9(10):2331; Matteucci and Caruthers, 1980, Tetrahedron Letters 21:719; and
Chow and Kempe, 1981, Nuc. Acids Res. 9(12):2807-2817. Alternatively, the polypeptide
itself could be produced using chemical methods to synthesize an amino acid sequence in
whole or in part. For example, peptides can be synthesized by solid phase techniques,
cleaved from the resin, and purified by preparative high performance liquid
chromatography. (See Creighton, 1983, Proteins, Structures and Molecular Principles,
W.H. Freeman and Co., N.Y., pp. 50-60). The composition of the synthetic polypeptides
may be confirmed by amino acid analysis or sequencing (e.g., the Edman degradation
procedure; see Creighton, 1983, Proteins, Structures and Molecular Principles, W.H.
Freeman and Co., N.Y., pp. 34-49).

Additionally, the coding sequence of a fusion protein can be mutated in vitro or in
vivo, to create and/or destroy translation, initiation, and/or termination sequences, or to
create variations in coding regions and/or form new restriction endonuclease sites or destroy
preexisting ones, to facilitate further in vitro modification. Any technique for mutagenesis
known in the art can be used, including but not limited to, chemical mutagenesis, in vitro
site-directed mutagenesis (Hutchinson, C., et al., 1978, J. Biol. Chem. 253:6551), use of
TAB® linkers (Pharmacia), and the like. It is important that the manipulations do not
destroy immunogenicity of the fusion polypeptides.

In addition, nonclassical amino acids or chemical amino acid analogs can be
introduced as a substitution or addition into the sequence. Non-classical amino acids
include, but are not limited to, the D-isomers of the common amino acids, α-amino
isobutyric acid, 4-aminobutyric acid, Abu, 2-amino butyric acid, γ-Abu, ε-Ahx, 6-amino
hexanoic acid, Aib, 2-amino isobutyric acid, 3-amino propionic acid, ornithine, norleucine,
norvaline, hydroxyproline, sarcosine, citrulline, cysteic acid, t-butylglycine, t-butylalanine,
phenylglycine, cyclohexylalanine, β-alanine, fluoro-amino acids, designer amino acids such
as β-methyl amino acids, Cα-methyl amino acids, Nα-methyl amino acids, and amino acid
analogs in general. Furthermore, the amino acid can be D (dextrorotary) or L (levorotary).

In a specific embodiment, the coding sequences of each antigen in the fusion protein
are joined at their amino- or carboxy-terminus via a peptide bond in any order.
Alternatively, a peptide linker sequence may be employed to separate the individual
polypeptides that make up a fusion polypeptide by a distance sufficient to ensure that each
polypeptide folds into a secondary and tertiary structure that maximizes its antigenic
effectiveness for preventing and treating tuberculosis. Such a peptide linker sequence is
incorporated into the fusion protein using standard techniques well known in the art.
Suitable peptide linker sequences may be chosen based on the following factors: (1) their
ability to adopt a flexible extended conformation; (2) their inability to adopt a secondary
structure that could interact with functional epitopes on the first and second polypeptides;
and (3) the lack of hydrophobic or charged residues that might react with the polypeptide
functional epitopes. Preferred peptide linker sequences contain Gly, Asn and Ser residues.
Other near neutral amino acids, such as Thr and Ala, may also be used in the linker
sequence. Amino acid sequences which may be usefully employed as linkers include those
disclosed in Maratea et al., Gene 40:39-46, 1985; Murphy et al., Proc. Natl. Acad. Sci. USA
83:8258-8262, 1986; U.S. Patent No. 4,935,233 and U.S. Patent No. 4,751,180. The linker
sequence may be from 1 to about 50 amino acids in length. Peptide linker sequences are not
required when the first and second polypeptides have non-essential N-terminal amino acid
regions that can be used to separate the functional domains and prevent steric interference.
For example, the antigens in a fusion protein may be connected by a flexible polylinker
such as Gly-Cys-Gly or Gly-Gly-Gly-Gly-Ser repeated 1 to 3 times (Bird et al., 1988,
Science 242:423-426; Chaudhary et al., 1990, Proc. Natl. Acad. Sci. U.S.A. 87:1066-1070).
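
By way of illustration only, the linker parameters described above (a Gly-Gly-Gly-Gly-Ser
unit repeated 1 to 3 times, a total length of 1 to about 50 residues, and a preference for
Gly, Asn, Ser, Thr and Ala) can be sketched in Python. The helper names and the placeholder
antigen sequences are hypothetical and are not part of the disclosure.

```python
# Illustrative sketch; helper names and example sequences are hypothetical.
GLY4SER = "GGGGS"             # one-letter code for Gly-Gly-Gly-Gly-Ser
NEAR_NEUTRAL = set("GNSTA")   # Gly, Asn, Ser, plus Thr and Ala


def make_linker(unit: str = GLY4SER, repeats: int = 1) -> str:
    """Repeat a linker unit 1 to 3 times, keeping the total at 1 to 50 residues."""
    if not 1 <= repeats <= 3:
        raise ValueError("repeats must be between 1 and 3")
    linker = unit * repeats
    if not 1 <= len(linker) <= 50:
        raise ValueError("linker must be 1 to 50 amino acids long")
    return linker


def residues_outside_near_neutral(linker: str) -> list:
    """List linker residues other than Gly, Asn, Ser, Thr and Ala, for review."""
    return sorted(set(linker) - NEAR_NEUTRAL)


def join_antigens(antigens: list, linker: str = "") -> str:
    """Join antigen sequences in the given order; an empty linker is a direct fusion."""
    return linker.join(antigens)


if __name__ == "__main__":
    antigen_a, antigen_b = "MKTAYIAK", "MSDQLVEG"  # placeholder sequences
    print(join_antigens([antigen_a, antigen_b], make_linker(repeats=2)))
```

The order of the list passed to join_antigens fixes the order of the antigens in the
fusion, and passing no linker models the direct peptide-bond fusion described first.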

In one embodiment, such a protein is produced by recombinant expression of a
nucleic acid encoding the protein. Such a fusion product can be made by ligating the
appropriate nucleic acid sequences encoding the desired amino acid sequences to each other
by methods known in the art, in the proper coding frame, and expressing the product by
methods known in the art. Alternatively, such a product may be made by protein synthetic
techniques, e.g., by use of a peptide synthesizer. Coding sequences for other molecules
such as a cytokine or an adjuvant can be added to the fusion polynucleotide as well.
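
The requirement that coding sequences be ligated "in the proper coding frame" can be
illustrated with a minimal Python sketch. It assumes each input is a complete open reading
frame written 5' to 3' in DNA letters; the function names and the short example sequences
are hypothetical, and the standard genetic code is used for translation.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence; '*' marks a stop codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def fuse_in_frame(orfs: list, linker_dna: str = "") -> str:
    """Join ORFs so that every downstream sequence stays in the same reading frame."""
    if len(linker_dna) % 3:
        raise ValueError("linker length must be a multiple of 3")
    parts = []
    for index, orf in enumerate(orfs):
        orf = orf.upper()
        if len(orf) % 3:
            raise ValueError("ORF length must be a multiple of 3")
        # Drop the stop codon of every ORF except the last so translation reads through.
        if index < len(orfs) - 1 and orf[-3:] in STOP_CODONS:
            orf = orf[:-3]
        parts.append(orf)
    fused = linker_dna.upper().join(parts)
    if "*" in translate(fused)[:-1]:
        raise ValueError("fusion contains an internal stop codon")
    return fused


if __name__ == "__main__":
    fused = fuse_in_frame(["ATGAAATAA", "ATGGGTTGA"], "GGTGGTGGTGGTTCT")
    print(fused, translate(fused))
```

A length that is not a multiple of three, or a stop codon left inside the fusion, is
rejected because either would truncate or garble the downstream antigen.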



5.5. PRODUCTION OF FUSION PROTEINS

In order to produce an M. tuberculosis fusion protein of the invention, the nucleotide
sequence coding for the protein, or a functional equivalent, is inserted into an appropriate
expression vector, i.e., a vector which contains the necessary elements for the transcription
and translation of the inserted coding sequence. The host cells or cell lines transfected or
transformed with recombinant expression vectors can be used for a variety of purposes.
These include, but are not limited to, large scale production of the fusion protein.

Methods which are well known to those skilled in the art can be used to construct
expression vectors containing a fusion coding sequence and appropriate
transcriptional/translational control signals. These methods include in vitro recombinant
DNA techniques, synthetic techniques and in vivo recombination/genetic recombination.
(See, e.g., the techniques described in Sambrook et al., 1989, Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory, N.Y. and Ausubel et al., 1989, Current
Protocols in Molecular Biology, Greene Publishing Associates and Wiley Interscience,
N.Y.). RNA capable of encoding a polypeptide may also be chemically synthesized (Gait,
ed., 1984, Oligonucleotide Synthesis, IRL Press, Oxford).

5.5.1. EXPRESSION SYSTEMS

A variety of host-expression vector systems may be utilized to express a fusion
protein coding sequence. These include, but are not limited to, microorganisms such as
bacteria (e.g., E. coli, B. subtilis) transformed with recombinant bacteriophage DNA,
plasmid DNA or cosmid DNA expression vectors containing a coding sequence; yeast (e.g.,
Saccharomyces, Pichia) transformed with recombinant yeast expression vectors
containing a coding sequence; insect cell systems infected with recombinant virus
expression vectors (e.g., baculovirus) containing a coding sequence; plant cell systems infected
with recombinant virus expression vectors (e.g., cauliflower mosaic virus, CaMV; tobacco
mosaic virus, TMV) or transformed with recombinant plasmid expression vectors (e.g., Ti
plasmid) containing a coding sequence; or mammalian cell systems (e.g., COS, CHO, BHK,
293, 3T3 cells). The expression elements of these systems vary in their strength and
specificities.

Depending on the host/vector system utilized, any of a number of suitable
transcription and translation elements, including constitutive and inducible promoters, may
be used in the expression vector. For example, when cloning in bacterial systems, inducible
promoters such as pL of bacteriophage λ, plac, ptrp, ptac (ptrp-lac hybrid promoter;
cytomegalovirus promoter) and the like may be used; when cloning in insect cell systems,
promoters such as the baculovirus polyhedrin promoter may be used; when cloning in plant
cell systems, promoters derived from the genome of plant cells (e.g., heat shock promoters;
the promoter for the small subunit of RUBISCO; the promoter for the chlorophyll a/b
binding protein) or from plant viruses (e.g., the 35S RNA promoter of CaMV; the coat
protein promoter of TMV) may be used; when cloning in mammalian cell systems,
promoters derived from the genome of mammalian cells (e.g., metallothionein promoter) or
from mammalian viruses (e.g., the adenovirus late promoter; the vaccinia virus 7.5K
promoter) may be used; when generating cell lines that contain multiple copies of the
antigen coding sequence, SV40-, BPV- and EBV-based vectors may be used with an
appropriate selectable marker.

Bacterial systems are preferred for the expression of M. tuberculosis antigens. For
in vivo delivery, a bacterium such as Bacillus Calmette-Guérin may be engineered to
express a fusion polypeptide of the invention on its cell surface. A number of other
bacterial expression vectors may be advantageously selected depending upon the use
intended for the expressed products. For example, when large quantities of the fusion
protein are to be produced for formulation of pharmaceutical compositions, vectors which
direct the expression of high levels of fusion protein products that are readily purified may
be desirable. Such vectors include, but are not limited to, the E. coli expression vector
pUR278 (Ruther et al., 1983, EMBO J. 2:1791), in which a coding sequence may be ligated
into the vector in frame with the lacZ coding region so that a hybrid protein is produced;
pIN vectors (Inouye and Inouye, 1985, Nucleic Acids Res. 13:3101-3109; Van Heeke and
Schuster, 1989, J. Biol. Chem. 264:5503-5509); and the like. pGEX vectors may also be
used to express foreign polypeptides as fusion proteins with glutathione S-transferase
(GST). In general, such fusion proteins are soluble and can be purified easily from lysed
cells by adsorption to glutathione-agarose beads followed by elution in the presence of free
glutathione. The pGEX vectors are designed to include thrombin or factor Xa protease
cleavage sites so that the cloned fusion polypeptide of interest can be released from the GST
moiety.

5.5.2. PROTEIN PURIFICATION 

Once a recombinant protein is expressed, it can be identified by assays based on the
physical or functional properties of the product, including radioactive labeling of the
product followed by analysis by gel electrophoresis, radioimmunoassay, ELISA, bioassays,
etc.

Once the encoded protein is identified, it may be isolated and purified by standard
methods including chromatography (e.g., high performance liquid chromatography, ion
exchange, affinity, and sizing column chromatography), centrifugation, differential
solubility, or by any other standard technique for the purification of proteins. The actual
conditions used will depend, in part, on factors such as net charge, hydrophobicity,
hydrophilicity, etc., and will be apparent to those having skill in the art. The functional
properties may be evaluated using any suitable assay such as antibody binding, induction of
T cell proliferation, or stimulation of cytokine production such as IL-2, IL-4 and IFN-γ. For
the practice of the present invention, it is preferred that each fusion protein is at least 80%
purified from other proteins. It is more preferred that they are at least 90% purified. For in
vivo administration, it is preferred that the proteins are greater than 95% purified.

5.6. USES OF THE FUSION PROTEIN CODING SEQUENCE

The fusion protein coding sequence of the invention may be used to encode a protein
product for use as an immunogen to induce and/or enhance immune responses to M.
tuberculosis. In addition, such coding sequence may be ligated with a coding sequence of
another molecule such as a cytokine or an adjuvant. Such polynucleotides may be used in
vivo as a DNA vaccine (U.S. Patent Nos. 5,589,466; 5,679,647; 5,703,055). In this
embodiment of the invention, the polynucleotide expresses its encoded protein in a recipient
to directly induce an immune response. The polynucleotide may be injected into a naive
subject to prime an immune response to its encoded product, or administered to an infected
or immunized subject to enhance the secondary immune responses.

In a preferred embodiment, a therapeutic composition comprises a fusion protein
coding sequence or fragments thereof that is part of an expression vector. In particular,
such a polynucleotide contains a promoter operably linked to the coding region, said
promoter being inducible or constitutive, and, optionally, tissue-specific. In another
embodiment, a polynucleotide contains a coding sequence flanked by regions that promote
homologous recombination at a desired site in the genome, thus providing for
intrachromosomal expression of the coding sequence (Koller and Smithies, 1989, Proc.
Natl. Acad. Sci. USA 86:8932-8935; Zijlstra et al., 1989, Nature 342:435-438).

Delivery of the nucleic acid into a subject may be either direct, in which case the
subject is directly exposed to the nucleic acid or nucleic acid-carrying vector, or indirect, in
which case cells are first transformed with the nucleic acid in vitro, then transplanted into
the subject. These two approaches are known, respectively, as in vivo or ex vivo gene
transfer.

In a specific embodiment, the nucleic acid is directly administered in vivo, where it
is expressed to produce the encoded fusion protein product. This can be accomplished by
any of numerous methods known in the art, e.g., by constructing it as part of an appropriate
nucleic acid expression vector and administering it so that it becomes intracellular, e.g., by
infection using a defective or attenuated retroviral or other viral vector (see U.S. Patent No.
4,980,286), or by direct injection of naked DNA, or by use of microparticle bombardment
(e.g., a gene gun; Biolistic, Dupont), or coating with lipids or cell-surface receptors or
transfecting agents, encapsulation in liposomes, microparticles, or microcapsules (United
States Patent Nos. 5,407,609; 5,853,763; 5,814,344 and 5,820,883), or by administering it
in linkage to a peptide which is known to enter the nucleus, by administering it in linkage
to a ligand subject to receptor-mediated endocytosis (see, e.g., Wu and Wu, 1987, J. Biol.
Chem. 262:4429-4432) which can be used to target cell types specifically expressing the
receptors, etc. In another embodiment, a nucleic acid-ligand complex can be formed in
which the ligand comprises a fusogenic viral peptide to disrupt endosomes, allowing the
nucleic acid to avoid lysosomal degradation. In yet another embodiment, the nucleic acid
can be targeted in vivo for cell specific uptake and expression, by targeting a specific
receptor (see, e.g., PCT Publications WO 92/06180 dated April 16, 1992; WO 92/22635
dated December 23, 1992; WO 92/20316 dated November 26, 1992; WO 93/14188 dated
July 22, 1993; WO 93/20221 dated October 14, 1993). Alternatively, the nucleic acid can be
introduced intracellularly and incorporated within host cell DNA for expression, by
homologous recombination (Koller and Smithies, 1989, Proc. Natl. Acad. Sci. USA
86:8932-8935; Zijlstra et al., 1989, Nature 342:435-438).

In a specific embodiment, a viral vector such as a retroviral vector can be used (see
Miller et al., 1993, Meth. Enzymol. 217:581-599). Retroviral vectors have been modified
to delete retroviral sequences that are not necessary for packaging of the viral genome and
integration into host cell DNA. A fusion coding sequence is cloned into the vector, which
facilitates delivery of the nucleic acid into a recipient. More detail about retroviral vectors
can be found in Boesen et al., 1994, Biotherapy 6:291-302, which describes the use of a
retroviral vector to deliver the mdr1 gene to hematopoietic stem cells in order to make the
stem cells more resistant to chemotherapy. Other references illustrating the use of retroviral
vectors in gene therapy are: Clowes et al., 1994, J. Clin. Invest. 93:644-651; Kiem et al.,
1994, Blood 83:1467-1473; Salmons and Gunzberg, 1993, Human Gene Therapy 4:129-141;
and Grossman and Wilson, 1993, Curr. Opin. in Genetics and Devel. 3:110-114.

Adenoviruses are other viral vectors that can be used in gene therapy. Adenoviruses
are especially attractive vehicles for delivering genes to respiratory epithelia. Adenoviruses
naturally infect respiratory epithelia where they cause a mild disease. Other targets for
adenovirus-based delivery systems are liver, the central nervous system, endothelial cells,
and muscle. Adenoviruses have the advantage of being capable of infecting non-dividing
cells. Adeno-associated virus (AAV) has also been proposed for use in in vivo gene transfer
(Walsh et al., 1993, Proc. Soc. Exp. Biol. Med. 204:289-300).

Another approach involves transferring a construct to cells in tissue culture by such
methods as electroporation, lipofection, calcium phosphate mediated transfection, or viral
infection. Usually, the method of transfer includes the transfer of a selectable marker to the
cells. The cells are then placed under selection to isolate those cells that have taken up and
are expressing the transferred gene. Those cells are then delivered to a subject.

In this embodiment, the nucleic acid is introduced into a cell prior to administration
in vivo of the resulting recombinant cell. Such introduction can be carried out by any
method known in the art, including but not limited to transfection, electroporation,
microinjection, infection with a viral or bacteriophage vector containing the nucleic acid
sequences, cell fusion, chromosome-mediated gene transfer, microcell-mediated gene
transfer, spheroplast fusion, etc. Numerous techniques are known in the art for the
introduction of foreign genes into cells (see, e.g., Loeffler and Behr, 1993, Meth. Enzymol.
217:599-618; Cohen et al., 1993, Meth. Enzymol. 217:618-644; Cline, 1985, Pharmac.
Ther. 29:69-92) and may be used in accordance with the present invention.

The polynucleotides of the invention may also be used in the diagnosis of
tuberculosis for detection of polynucleotide sequences specific to M. tuberculosis in a
patient. Such detection may be accomplished, for example, by isolating polynucleotides
from a biological sample obtained from a patient suspected of being infected with the
bacteria. Upon isolation of polynucleotides from the biological sample, a labeled
polynucleotide of the invention that is complementary to one or more of the
polynucleotides will be allowed to hybridize to polynucleotides in the biological sample
using techniques of nucleic acid hybridization known to those of ordinary skill in the art.
For example, such hybridization may be carried out in solution or with one hybridization
partner on a solid support.

5.7. THERAPEUTIC AND PROPHYLACTIC USES OF THE FUSION PROTEIN

Purified or partially purified fusion proteins or fragments thereof may be formulated
as a vaccine or therapeutic composition. Such composition may include adjuvants to
enhance immune responses. In addition, such proteins may be further suspended in an oil
emulsion to cause a slower release of the proteins in vivo upon injection. The optimal ratios
of each component in the formulation may be determined by techniques well known to
those skilled in the art.

Any of a variety of adjuvants may be employed in the vaccines of this invention to
enhance the immune response. Most adjuvants contain a substance designed to protect the
antigen from rapid catabolism, such as aluminum hydroxide or mineral oil, and a
nonspecific stimulator of immune responses, such as lipid A, Bordetella pertussis or
Mycobacterium tuberculosis. Suitable adjuvants are commercially available and include,
for example, Freund's Incomplete Adjuvant and Freund's Complete Adjuvant (Difco
Laboratories) and Merck Adjuvant 65 (Merck and Company, Inc., Rahway, NJ). Other
suitable adjuvants include alum, biodegradable microspheres, monophosphoryl lipid A,
Quil A, SBAS1c, SBAS2 (Ling et al., 1997, Vaccine 15:1562-1567), SBAS7 and Al(OH)3.

In the vaccines of the present invention, it is preferred that the adjuvant induces an
immune response comprising Th1 aspects. Suitable adjuvant systems include, for example,
a combination of monophosphoryl lipid A, preferably 3-de-O-acylated monophosphoryl
lipid A (3D-MPL), together with an aluminum salt. An enhanced system involves the
combination of a monophosphoryl lipid A and a saponin derivative, particularly the
combination of 3D-MPL and the saponin QS21 as disclosed in WO 94/00153, or a less
reactogenic composition where the QS21 is quenched with cholesterol as disclosed in WO
96/33739. Previous experiments have demonstrated a clear synergistic effect of
combinations of 3D-MPL and QS21 in the induction of both humoral and Th1 type cellular
immune responses. A particularly potent adjuvant formulation involving QS21, 3D-MPL and
tocopherol in an oil-in-water emulsion is described in WO 95/17210 and is a preferred
formulation.

Formulations containing an antigen of the present invention may be administered to
a subject per se or in the form of a pharmaceutical or therapeutic composition.
Pharmaceutical compositions comprising the proteins may be manufactured by means of
conventional mixing, dissolving, granulating, dragee-making, levigating, emulsifying,
encapsulating, entrapping or lyophilizing processes. Pharmaceutical compositions may be
formulated in conventional manner using one or more physiologically acceptable carriers,
diluents, excipients or auxiliaries which facilitate processing of the polypeptides into
preparations which can be used pharmaceutically. Proper formulation is dependent upon
the route of administration chosen.

For topical administration, the proteins may be formulated as solutions, gels, 
ointments, creams, suspensions, etc. as are well-known in the art. 

Systemic formulations include those designed for administration by injection, e.g.,
subcutaneous, intravenous, intramuscular, intrathecal or intraperitoneal injection, as well as
those designed for transdermal, transmucosal, oral or pulmonary administration.

For injection, the proteins may be formulated in aqueous solutions, preferably in
physiologically compatible buffers such as Hanks' solution, Ringer's solution, or
physiological saline buffer. The solution may contain formulatory agents such as
suspending, stabilizing and/or dispersing agents. Alternatively, the proteins may be in
powder form for constitution with a suitable vehicle, e.g., sterile pyrogen-free water, before
use.

For transmucosal administration, penetrants appropriate to the barrier to be
permeated are used in the formulation. Such penetrants are generally known in the art.

For oral administration, a composition can be readily formulated by combining the
proteins with pharmaceutically acceptable carriers well known in the art. Such carriers
enable the proteins to be formulated as tablets, pills, dragees, capsules, liquids, gels, syrups,
slurries, suspensions and the like, for oral ingestion by a subject to be treated. For oral solid
formulations such as, for example, powders, capsules and tablets, suitable excipients
include fillers such as sugars (e.g., lactose, sucrose, mannitol and sorbitol); cellulose
preparations such as maize starch, wheat starch, rice starch, potato starch, gelatin, gum
tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP); granulating agents; and
binding agents. If desired, disintegrating agents may be added, such as the cross-linked
polyvinylpyrrolidone, agar, or alginic acid or a salt thereof such as sodium alginate.

If desired, solid dosage forms may be sugar-coated or enteric-coated using standard
techniques.

For oral liquid preparations such as, for example, suspensions, elixirs and solutions,
suitable carriers, excipients or diluents include water, glycols, oils, alcohols, etc.
Additionally, flavoring agents, preservatives, coloring agents and the like may be added.

For buccal administration, the proteins may take the form of tablets, lozenges, etc. 
formulated in conventional manner. 

For administration by inhalation, the proteins for use according to the present
invention are conveniently delivered in the form of an aerosol spray from pressurized packs
or a nebulizer, with the use of a suitable propellant, e.g., dichlorodifluoromethane,
trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or other suitable gas. In
the case of a pressurized aerosol the dosage unit may be determined by providing a valve to
deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in an inhaler or
insufflator may be formulated containing a powder mix of the proteins and a suitable
powder base such as lactose or starch.

The proteins may also be formulated in rectal or vaginal compositions such as 
suppositories or retention enemas, e.g., containing conventional suppository bases such as 
cocoa butter or other glycerides. 

In addition to the formulations described previously, the proteins may also be
formulated as a depot preparation. Such long acting formulations may be administered by
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection.
Thus, for example, the proteins may be formulated with suitable polymeric or hydrophobic
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as
sparingly soluble derivatives, for example, as a sparingly soluble salt.

Alternatively, other pharmaceutical delivery systems may be employed. Liposomes
and emulsions are well known examples of delivery vehicles that may be used to deliver an
antigen. Certain organic solvents such as dimethylsulfoxide also may be employed,
although usually at the cost of greater toxicity. The fusion proteins may also be
encapsulated in microspheres (United States Patent Nos. 5,407,609; 5,853,763; 5,814,344
and 5,820,883). Additionally, the proteins may be delivered using a sustained-release
system, such as semipermeable matrices of solid polymers containing the therapeutic or
vaccinating agent. Various sustained-release materials have been established and are well
known by those skilled in the art. Sustained-release capsules may, depending on their
chemical nature, release the proteins for a few weeks up to over 100 days. Depending on
the chemical nature and the biological stability of the reagent, additional strategies for
protein stabilization may be employed.

Determination of an effective amount of the fusion protein for inducing an immune
response in a subject is well within the capabilities of those skilled in the art, especially in
light of the detailed disclosure provided herein.

An effective dose can be estimated initially from in vitro assays. For example, a
dose can be formulated in animal models to achieve an induction of an immune response
using techniques that are well known in the art. One having ordinary skill in the art could
readily optimize administration to humans based on animal data. Dosage amount and
interval may be adjusted individually. For example, when used as a vaccine, the
polypeptides and/or polynucleotides of the invention may be administered in about 1 to 3
doses for a 1-36 week period. Preferably, 3 doses are administered, at intervals of about 3-4
months, and booster vaccinations may be given periodically thereafter. Alternate protocols
may be appropriate for individual patients. A suitable dose is an amount of polypeptide or
DNA that, when administered as described above, is capable of raising an immune response
in an immunized patient sufficient to protect the patient from M. tuberculosis infection for
at least 1-2 years. In general, the amount of polypeptide present in a dose (or produced in
situ by the DNA in a dose) ranges from about 1 pg to about 100 mg per kg of host, typically
from about 10 pg to about 1 mg, and preferably from about 100 pg to about 1 ng. The suitable
dose volume will vary with the size of the patient, but will typically range from about 0.1 mL
to about 5 mL.

5.8. DIAGNOSTIC USES OF THE FUSION PROTEIN

The fusion polypeptides of the invention are useful in the diagnosis of tuberculosis
infection in vitro and in vivo. The ability of a polypeptide of the invention to induce cell
proliferation or cytokine production can be assayed by the methods disclosed in Section 5.2,
supra.

In another aspect, this invention provides methods for using one or more of the
fusion polypeptides to diagnose tuberculosis using a skin test in vivo. As used herein, a skin
test is any assay performed directly on a patient in which a delayed-type hypersensitivity
(DTH) reaction (such as swelling, reddening or dermatitis) is measured following
intradermal injection of one or more polypeptides as described above. Such injection may
be achieved using any suitable device sufficient to contact the polypeptide with dermal cells
of the patient, such as, for example, a tuberculin syringe or 1 mL syringe. Preferably, the
reaction is measured at least about 48 hours after injection, more preferably about 48 to
about 72 hours after injection.

The DTH reaction is a cell-mediated immune response, which is greater in patients
that have been exposed previously to the test antigen (i.e., the immunogenic portion of the
polypeptide employed, or a variant thereof). The response may be measured visually, using
a ruler. In general, a response that is greater than about 0.5 cm in diameter, preferably
greater than about 1.0 cm in diameter, is a positive response, indicative of tuberculosis
infection, which may or may not be manifested as an active disease.
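
As an illustration only, the reading rule just described (measurement at least about 48
hours after injection, with a positive response defined as a diameter greater than about
0.5 cm, or preferably greater than about 1.0 cm) can be written as a small Python sketch.
The class and function names are hypothetical, and the fixed numeric comparisons stand in
for what the text states only as approximate values.

```python
from dataclasses import dataclass


@dataclass
class SkinTestReading:
    diameter_cm: float            # diameter of the DTH response, measured with a ruler
    hours_after_injection: float  # time elapsed since intradermal injection


def interpret(reading: SkinTestReading, cutoff_cm: float = 0.5) -> str:
    """Classify a skin-test reading against a diameter cut-off (0.5 cm or 1.0 cm)."""
    if reading.hours_after_injection < 48:
        return "read too early"
    return "positive" if reading.diameter_cm > cutoff_cm else "negative"


if __name__ == "__main__":
    reading = SkinTestReading(diameter_cm=0.8, hours_after_injection=60)
    print(interpret(reading))                 # general cut-off of 0.5 cm
    print(interpret(reading, cutoff_cm=1.0))  # preferred, stricter cut-off
```

Passing the cut-off as a parameter lets the same reading be evaluated against either the
general or the preferred threshold.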

The fusion polypeptides of this invention are preferably formulated, for use in a skin
test, as pharmaceutical compositions containing a polypeptide and a physiologically
acceptable carrier. Such compositions typically contain one or more of the above
polypeptides in an amount ranging from about 1 ng to about 100 μg, preferably from about
10 μg to about 50 μg in a volume of 0.1 mL. Preferably, the carrier employed in such
pharmaceutical compositions is a saline solution with appropriate preservatives, such as
phenol and/or Tween 80™.

In another aspect, the present invention provides methods for using the polypeptides
to diagnose tuberculosis. In this aspect, methods are provided for detecting M. tuberculosis
infection in a biological sample using the fusion polypeptides alone or in combination. As
used herein, a "biological sample" is any antibody-containing sample obtained from a
patient. Preferably, the sample is whole blood, sputum, serum, plasma, saliva, cerebrospinal
fluid or urine. More preferably, the sample is a blood, serum or plasma sample obtained
from a patient or a blood supply. The polypeptide(s) are used in an assay, as described
below, to determine the presence or absence of antibodies to the polypeptide(s) in the
sample relative to a predetermined cut-off value. The presence of such antibodies indicates
previous sensitization to mycobacterial antigens which may be indicative of tuberculosis.

In embodiments in which more than one fusion polypeptide is employed, the
polypeptides used are preferably complementary (i.e., one component polypeptide will tend
to detect infection in samples where the infection would not be detected by another
component polypeptide). Complementary polypeptides may generally be identified by using
each polypeptide individually to evaluate serum samples obtained from a series of patients
known to be infected with M. tuberculosis. After determining which samples test positive
(as described below) with each polypeptide, combinations of two or more fusion
polypeptides may be formulated that are capable of detecting infection in most, or all, of the
samples tested. Such polypeptides are complementary. Approximately 25-30% of sera from
tuberculosis-infected individuals are negative for antibodies to any single protein.
Complementary polypeptides may, therefore, be used in combination to improve sensitivity
of a diagnostic test.
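
One way to picture the identification of complementary polypeptides is as a coverage
problem over sera from known-infected patients. The Python sketch below uses a simple
greedy selection, which is an assumption made for illustration and not a procedure
specified in the text; the polypeptide names, sample identifiers and reactivity data are
hypothetical.

```python
def pick_complementary(reactive: dict, max_panel: int = 3) -> tuple:
    """Greedily choose polypeptides whose positive sera overlap the least.

    `reactive` maps each polypeptide name to the set of infected-patient samples
    that tested positive with that polypeptide alone.
    """
    panel, covered = [], set()
    if not reactive:
        return panel, covered
    while len(panel) < max_panel:
        # Pick the polypeptide that detects the most samples not yet covered.
        best = max(sorted(reactive), key=lambda name: len(reactive[name] - covered))
        gain = reactive[best] - covered
        if not gain:
            break  # no remaining polypeptide detects anything new
        panel.append(best)
        covered |= gain
    return panel, covered


if __name__ == "__main__":
    sera = {
        "fusionA": {"s1", "s2", "s3", "s4"},
        "fusionB": {"s3", "s4", "s5"},
        "fusionC": {"s5", "s6"},
    }
    panel, covered = pick_complementary(sera)
    print(panel, f"sensitivity = {len(covered)}/6")
```

In this made-up data set no single polypeptide detects all six sera, while the pair
selected by the sketch does, which is the sense in which the pair is complementary.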

There are a variety of assay formats known to those of ordinary skill in the art for
using one or more polypeptides to detect antibodies in a sample. See, e.g., Harlow and Lane,
Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, 1988, which is
incorporated herein by reference. In a preferred embodiment, the assay involves the use of
polypeptide immobilized on a solid support to bind to and remove the antibody from the
sample. The bound antibody may then be detected using a detection reagent that contains a
reporter group. Suitable detection reagents include antibodies that bind to the
antibody/polypeptide complex and free polypeptide labeled with a reporter group (e.g., in a
semi-competitive assay). Alternatively, a competitive assay may be utilized, in which an
antibody that binds to the polypeptide is labeled with a reporter group and allowed to bind
to the immobilized antigen after incubation of the antigen with the sample. The extent to
which components of the sample inhibit the binding of the labeled antibody to the
polypeptide is indicative of the reactivity of the sample with the immobilized polypeptide.
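
For the competitive format, the "extent of inhibition" is commonly expressed as a
percentage of the signal obtained without sample. The Python sketch below shows that
arithmetic; the formula is a conventional choice assumed here for illustration, and the
function name and signal values are hypothetical.

```python
def percent_inhibition(signal_with_sample: float, signal_without_sample: float) -> float:
    """Percent reduction in labeled-antibody signal caused by pre-incubation with sample."""
    if signal_without_sample <= 0:
        raise ValueError("the uninhibited control signal must be positive")
    return 100.0 * (1.0 - signal_with_sample / signal_without_sample)


if __name__ == "__main__":
    # Hypothetical readings: control well versus well pre-incubated with patient serum.
    print(percent_inhibition(signal_with_sample=0.25, signal_without_sample=1.0))
```

A higher percentage means that more of the immobilized polypeptide was occupied by
antibodies from the sample before the labeled antibody was added.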

The solid support may be any solid material known to those of ordinary skill in the
art to which the antigen may be attached. For example, the solid support may be a test well
in a microtiter plate or a nitrocellulose or other suitable membrane. Alternatively, the
support may be a bead or disc, such as glass, fiberglass, latex or a plastic material such as
polystyrene or polyvinylchloride. The support may also be a magnetic particle or a fiber
optic sensor, such as those disclosed, for example, in U.S. Patent No. 5,359,681.

The polypeptides may be bound to the solid support using a variety of techniques
known to those of ordinary skill in the art. In the context of the present invention, the term
"bound" refers to both noncovalent association, such as adsorption, and covalent attachment
(which may be a direct linkage between the antigen and functional groups on the support or
may be a linkage by way of a cross-linking agent). Binding by adsorption to a well in a
microtiter plate or to a membrane is preferred. In such cases, adsorption may be achieved by
contacting the polypeptide, in a suitable buffer, with the solid support for a suitable amount
of time. The contact time varies with temperature, but is typically between about 1 hour and
1 day. In general, contacting a well of a plastic microtiter plate (such as polystyrene or
polyvinylchloride) with an amount of polypeptide ranging from about 10 ng to about 1 μg,
and preferably about 100 ng, is sufficient to bind an adequate amount of antigen.

Covalent attachment of polypeptide to a solid support may generally be achieved by
first reacting the support with a bifunctional reagent that will react with both the support
and a functional group, such as a hydroxyl or amino group, on the polypeptide. For
example, the polypeptide may be bound to supports having an appropriate polymer coating
using benzoquinone or by condensation of an aldehyde group on the support with an amine
and an active hydrogen on the polypeptide (see, e.g., Pierce Immunotechnology Catalog and
Handbook, 1991, at A12-A13).

In certain embodiments, the assay is an enzyme linked immunosorbent assay
(ELISA). This assay may be performed by first contacting a fusion polypeptide antigen that
has been immobilized on a solid support, commonly the well of a microtiter plate, with the
sample, such that antibodies to the polypeptide within the sample are allowed to bind to the
immobilized polypeptide. Unbound sample is then removed from the immobilized
polypeptide and a detection reagent capable of binding to the immobilized
antibody-polypeptide complex is added. The amount of detection reagent that remains bound
to the solid support is then determined using a method appropriate for the specific detection
reagent.

More specifically, once the polypeptide is immobilized on the support as described
above, the remaining protein binding sites on the support are typically blocked. Any
suitable blocking agent known to those of ordinary skill in the art, such as bovine serum
albumin or Tween 20™ (Sigma Chemical Co., St. Louis, MO) may be employed. The
immobilized polypeptide is then incubated with the sample, and antibody is allowed to bind
to the antigen. The sample may be diluted with a suitable diluent, such as
phosphate-buffered saline (PBS) prior to incubation. In general, an appropriate contact time
is that period of time that is sufficient to detect the presence of antibody within an
M. tuberculosis-infected sample. Preferably, the contact time is sufficient to achieve a level
of binding that is at least 95% of that achieved at equilibrium between bound and unbound
antibody. Those of ordinary skill in the art will recognize that the time necessary to achieve
equilibrium may be readily determined by assaying the level of binding that occurs over a
period of time. At room temperature, an incubation time of about 30 minutes is generally
sufficient.
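
The 95%-of-equilibrium criterion can be illustrated numerically. The following is a minimal sketch that assumes, purely for illustration, that binding approaches equilibrium as a single exponential, B(t) = B_eq (1 - exp(-k t)). The function name and the rate constant `k_obs_per_min` are hypothetical and are not taken from this specification.

```python
import math


def time_to_fraction_of_equilibrium(k_obs_per_min: float, fraction: float = 0.95) -> float:
    """Minutes needed to reach `fraction` of the equilibrium binding level.

    Assumes B(t) = B_eq * (1 - exp(-k_obs * t)), an illustrative model only.
    """
    if k_obs_per_min <= 0:
        raise ValueError("k_obs_per_min must be positive")
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return -math.log(1.0 - fraction) / k_obs_per_min


if __name__ == "__main__":
    # Hypothetical observed rate constant of 0.1 per minute.
    print(round(time_to_fraction_of_equilibrium(0.1), 1))
```

Under this assumed model, a rate constant of 0.1 per minute gives a contact time of roughly 30 minutes, which is of the same order as the incubation time stated above. In practice the time course would be measured rather than modeled.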

Unbound sample may then be removed by washing the solid support with an
appropriate buffer, such as PBS containing 0.1% Tween 20™. Detection reagent may then
be added to the solid support. An appropriate detection reagent is any compound that binds
to the immobilized antibody-polypeptide complex and that can be detected by any of a
variety of means known to those in the art. Preferably, the detection reagent contains a
binding agent (for example, Protein A, Protein G, lectin or free antigen) conjugated to a
reporter group. Preferred reporter groups include enzymes (such as horseradish peroxidase),
substrates, cofactors, inhibitors, dyes, radionuclides, luminescent groups, fluorescent
groups, biotin and colloidal particles, such as colloidal gold and selenium. The conjugation
of binding agent to reporter group may be achieved using standard methods known to those
of ordinary skill in the art. Common binding agents may also be purchased conjugated to a
variety of reporter groups from many commercial sources (e.g., Zymed Laboratories, San
Francisco, CA, and Pierce, Rockford, IL).

The detection reagent is then incubated with the immobilized antibody-polypeptide
complex for an amount of time sufficient to detect the bound antibody. An appropriate
amount of time may generally be determined from the manufacturer's instructions or by
assaying the level of binding that occurs over a period of time. Unbound detection reagent
is then removed and bound detection reagent is detected using the reporter group. The
method employed for detecting the reporter group depends upon the nature of the reporter
group. For radioactive groups, scintillation counting or autoradiographic methods are
generally appropriate. Spectroscopic methods may be used to detect dyes, luminescent
groups and fluorescent groups. Biotin may be detected using avidin, coupled to a different
reporter group (commonly a radioactive or fluorescent group or an enzyme). Enzyme
reporter groups may generally be detected by the addition of substrate (generally for a
specific period of time), followed by spectroscopic or other analysis of the reaction
products.

To determine the presence or absence of anti-M. tuberculosis antibodies in the
sample, the signal detected from the reporter group that remains bound to the solid support
is generally compared to a signal that corresponds to a predetermined cut-off value. In one
preferred embodiment, the cut-off value is the average mean signal obtained when the
immobilized antigen is incubated with samples from an uninfected patient. In general, a
sample generating a signal that is three standard deviations above the predetermined cut-off
value is considered positive for tuberculosis. In an alternate preferred embodiment, the
cut-off value is determined using a Receiver Operator Curve, according to the method of
Sackett et al., 1985, Clinical Epidemiology: A Basic Science for Clinical Medicine, Little
Brown and Co., pp. 106-107. Briefly, in this embodiment, the cut-off value may be
determined from a plot of pairs of true positive rates (i.e., sensitivity) and false positive
rates (100%-specificity) that correspond to each possible cut-off value for the diagnostic test
result. The cut-off value on the plot that is the closest to the upper left-hand corner (i.e., the
value that encloses the largest area) is the most accurate cut-off value, and a sample
generating a signal that is higher than the cut-off value determined by this method may be
considered positive. Alternatively, the cut-off value may be shifted to the left along the plot,
to minimize the false positive rate, or to the right, to minimize the false negative rate. In
general, a sample generating a signal that is higher than the cut-off value determined by this
method is considered positive for tuberculosis.
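
The two ways of setting a cut-off described above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the use of the sample standard deviation is an assumption (the text does not say which estimator is used), and the ROC routine simply picks the candidate threshold whose (false positive rate, true positive rate) point lies nearest the upper left-hand corner.

```python
import math
from statistics import mean, stdev


def positivity_threshold(uninfected_signals, n_sd=3.0):
    """Mean signal of uninfected samples plus n_sd sample standard deviations."""
    return mean(uninfected_signals) + n_sd * stdev(uninfected_signals)


def roc_cutoff(infected_signals, uninfected_signals):
    """Candidate cut-off whose ROC point lies closest to the upper left-hand corner."""
    candidates = sorted(set(infected_signals) | set(uninfected_signals))
    best_cutoff, best_distance = None, math.inf
    for cutoff in candidates:
        # A signal strictly higher than the cut-off is scored positive.
        tpr = sum(s > cutoff for s in infected_signals) / len(infected_signals)
        fpr = sum(s > cutoff for s in uninfected_signals) / len(uninfected_signals)
        # Distance from the ideal point (FPR = 0, TPR = 1).
        distance = math.hypot(fpr, 1.0 - tpr)
        if distance < best_distance:
            best_cutoff, best_distance = cutoff, distance
    return best_cutoff


if __name__ == "__main__":
    # Hypothetical OD readings, for illustration only.
    infected = [0.9, 1.2, 0.5, 1.5]
    uninfected = [0.1, 0.2, 0.15, 0.3]
    print(positivity_threshold(uninfected))
    print(roc_cutoff(infected, uninfected))
```

Shifting the chosen cut-off up or down from the value returned by `roc_cutoff` trades false positives against false negatives, as noted in the text.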

In a related embodiment, the assay is performed in a rapid flow-through or strip test
format, wherein the antigen is immobilized on a membrane, such as nitrocellulose. In the
flow-through test, antibodies within the sample bind to the immobilized polypeptide as the
sample passes through the membrane. A detection reagent (e.g., protein A-colloidal gold)
then binds to the antibody-polypeptide complex as the solution containing the detection
reagent flows through the membrane. The detection of bound detection reagent may then
be performed as described above. In the strip test format, one end of the membrane to
which polypeptide is bound is immersed in a solution containing the sample. The sample
migrates along the membrane through a region containing detection reagent and to the area
of immobilized polypeptide. Concentration of detection reagent at the polypeptide indicates
the presence of anti-M. tuberculosis antibodies in the sample. Typically, the concentration
of detection reagent at that site generates a pattern, such as a line, that can be read visually.
The absence of such a pattern indicates a negative result. In general, the amount of
polypeptide immobilized on the membrane is selected to generate a visually discernible
pattern when the biological sample contains a level of antibodies that would be sufficient to
generate a positive signal in an ELISA, as discussed above. Preferably, the amount of
polypeptide immobilized on the membrane ranges from about 5 ng to about 1 µg, and more
preferably from about 50 ng to about 500 ng. Such tests can typically be performed with a
very small amount (e.g., one drop) of patient serum or blood.

The invention having been described, the following examples are offered by way of
illustration and not limitation.

6. EXAMPLE: FUSION PROTEINS OF M. TUBERCULOSIS ANTIGENS 
RETAIN IMMUNOGENICITY OF THE INDIVIDUAL 
COMPONENTS 

6.1. MATERIALS AND METHODS 

6.1.1. CONSTRUCTION OF FUSION PROTEINS 

Coding sequences of M. tuberculosis antigens were modified by PCR in order to
facilitate their fusion and subsequent expression of fusion protein. DNA amplification was
performed using 10 µl 10X Pfu buffer, 2 µl 10 mM dNTPs, 2 µl each of the PCR primers at
10 µM concentration, 81.5 µl water, 1.5 µl Pfu DNA polymerase (Stratagene, La Jolla, CA)
and 1 µl DNA at either 70 ng/µl (for TbRa3 antigen) or 50 ng/µl (for 38 kD and Tb38-1
antigens). For TbRa3 antigen, denaturation at 94°C was performed for 2 min, followed by
40 cycles of 96°C for 15 sec and 72°C for 1 min, and lastly by 72°C for 4 min. For 38 kD
antigen, denaturation at 96°C was performed for 2 min, followed by 40 cycles of 96°C for
30 sec, 68°C for 15 sec and 72°C for 3 min, and finally by 72°C for 4 min. For Tb38-1
antigen, denaturation at 94°C for 2 min was followed by 10 cycles of 96°C for 15 sec, 68°C
for 15 sec and 72°C for 1.5 min, 30 cycles of 96°C for 15 sec, 64°C for 15 sec and 72°C for
1.5 min, and finally by 72°C for 4 min.
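
As a check on the recipe above, the reaction volumes and cycling programs can be tabulated and summed. The sketch below assumes that "each of the PCR primers" means one primer pair (two primers); the dictionary layout and function names are hypothetical, and the computed program times count hold steps only, ignoring ramping between temperatures.

```python
# Volumes in microlitres as listed in the text (assuming two primers at 2 ul each).
REACTION_UL = {
    "10X Pfu buffer": 10.0,
    "10 mM dNTPs": 2.0,
    "primer 1 (10 uM)": 2.0,
    "primer 2 (10 uM)": 2.0,
    "water": 81.5,
    "Pfu DNA polymerase": 1.5,
    "template DNA": 1.0,
}

# (initial denaturation in s, [(cycles, [step durations in s])], final extension in s)
PROGRAMS = {
    "TbRa3": (120, [(40, [15, 60])], 240),
    "38 kD": (120, [(40, [30, 15, 180])], 240),
    "Tb38-1": (120, [(10, [15, 15, 90]), (30, [15, 15, 90])], 240),
}


def total_volume_ul(reaction):
    """Sum of all component volumes in the reaction."""
    return sum(reaction.values())


def final_concentration(stock, volume_ul, total_ul):
    """Dilution of a stock: result is in the same units as `stock`."""
    return stock * volume_ul / total_ul


def hold_time_min(program):
    """Total programmed hold time in minutes (ramping not included)."""
    initial, stages, final = program
    cycling = sum(cycles * sum(steps) for cycles, steps in stages)
    return (initial + cycling + final) / 60.0


if __name__ == "__main__":
    total = total_volume_ul(REACTION_UL)
    print("total volume (ul):", total)
    print("dNTPs (mM):", final_concentration(10.0, 2.0, total))
    print("each primer (uM):", final_concentration(10.0, 2.0, total))
    for antigen, program in PROGRAMS.items():
        print(antigen, "hold time (min):", hold_time_min(program))
```

Under the two-primer assumption the components sum to 100 µl, so the 10X buffer is diluted to 1X.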

Following digestion with a restriction endonuclease to yield the desired cohesive or
blunt ends, a polynucleotide specific for each fusion polypeptide was ligated into an
expression plasmid. Each resulting plasmid contained the coding sequences of the
individual antigens of each fusion polypeptide. The expression vectors used were pET-12b
and pT7^L2ILl.

Three coding sequences for antigens Ra12, TbH9 and Ra35 were ligated to encode
one fusion protein (SEQ ID NOS:1 and 2) (Fig. 1A and 1B). Another three coding
sequences for antigens Erd14, DPV and MTI were ligated to encode a second fusion protein
(SEQ ID NOS:3 and 4) (Fig. 2). Three coding sequences for antigens TbRa3, 38 kD and
Tb38-1 were ligated to encode one fusion protein (SEQ ID NOS:5 and 6) (Fig. 3A - 3D).
Two coding sequences for antigens TbH9 and Tb38-1 were ligated to encode one fusion
protein (SEQ ID NOS:7 and 8) (Fig. 4A - 4D). Four coding sequences for antigens TbRa3,
38 kD, Tb38-1 and DPEP were ligated to encode one fusion protein (SEQ ID NOS:9 and 10)
(Fig. 5A - 5J). Five coding sequences for antigens Erd14, DPV, MTI, MSL and MTCC2
were ligated to encode one fusion protein (SEQ ID NOS:11 and 12) (Fig. 6A and 6B). Four
coding sequences for antigens Erd14, DPV, MTI and MSL were ligated to encode one
fusion protein (SEQ ID NOS:13 and 14) (Fig. 7A and 7B). Four coding sequences for
antigens DPV, MTI, MSL and MTCC2 were ligated to encode one fusion protein (SEQ ID
NOS:15 and 16) (Fig. 8A and 8B). Three coding sequences for antigens DPV, MTI and
MSL were ligated to encode one fusion protein (SEQ ID NOS:17 and 18) (Fig. 9A and 9B).
Three coding sequences for antigens TbH9, DPV and MTI were ligated to encode one
fusion protein (SEQ ID NOS:19 and 20) (Fig. 10A and 10B). Three coding sequences for
antigens Erd14, DPV and MTI were ligated to encode one fusion protein (SEQ ID NOS:21
and 22) (Fig. 11A and 11B). Two coding sequences for antigens TbH9 and Ra35 were
ligated to encode one fusion protein (SEQ ID NOS:23 and 24) (Fig. 12A and 12B). Two
coding sequences for antigens Ra12 and DPPD were ligated to encode one fusion protein
(SEQ ID NOS:25 and 26) (Fig. 13A and 13B).

The recombinant proteins were expressed in E. coli with six histidine residues at the
amino-terminal portion using the pET plasmid vector (pET-17b) and a T7 RNA polymerase
expression system (Novagen, Madison, WI). E. coli strain BL21 (DE3) pLysE (Novagen)
was used for high level expression. The recombinant (His-Tag) fusion proteins were
purified from the soluble supernatant or the insoluble inclusion body of 500 ml of IPTG
induced batch cultures by affinity chromatography using the one step QIAexpress Ni-NTA
Agarose matrix (QIAGEN, Chatsworth, CA) in the presence of 8 M urea. Briefly, 20 ml of
an overnight saturated culture of BL21 containing the pET construct was added into 500 ml
of 2xYT media containing 50 µg/ml ampicillin and 34 µg/ml chloramphenicol, grown at
37°C with shaking. The bacterial cultures were induced with 2 mM IPTG at an OD 560 of
0.3 and grown for an additional 3 h (OD = 1.3 to 1.9). Cells were harvested from 500 ml
batch cultures by centrifugation and resuspended in 20 ml of binding buffer (0.1 M sodium
phosphate, pH 8.0; 10 mM Tris-HCl, pH 8.0) containing 2 mM PMSF and 20 µg/ml
leupeptin plus one complete protease inhibitor tablet (Boehringer Mannheim) per 25 ml.
E. coli was lysed by freeze-thaw followed by brief sonication, then spun at 12 k rpm for 30
min to pellet the inclusion bodies.

The inclusion bodies were washed three times in 1% CHAPS in 10 mM Tris-HCl
(pH 8.0). This step greatly reduced the level of contaminating LPS. The inclusion body was
finally solubilized in 20 ml of binding buffer containing 8 M urea or 8 M urea was added
directly into the soluble supernatant. Recombinant fusion proteins with His-Tag residues
were batch bound to Ni-NTA agarose resin (5 ml resin per 500 ml inductions) by rocking at
room temperature for 1 h and the complex passed over a column. The flow through was
passed twice over the same column and the column washed three times with 30 ml each of
wash buffer (0.1 M sodium phosphate and 10 mM Tris-HCl, pH 6.3) also containing 8 M
urea. Bound protein was eluted with 30 ml of 150 mM imidazole in wash buffer and 5 ml
fractions collected. Fractions containing each recombinant fusion protein were pooled,
dialyzed against 10 mM Tris-HCl (pH 8.0), bound one more time to the Ni-NTA matrix,
eluted and dialyzed in 10 mM Tris-HCl (pH 7.8). The yield of recombinant protein varies
from 25-150 mg per liter of induced bacterial culture with greater than 98% purity.
Recombinant proteins were assayed for endotoxin contamination using the Limulus assay
(BioWhittaker) and were shown to contain < 10 E.U./mg.

6.1.2. T-CELL PROLIFERATION ASSAY

Purified fusion polypeptides were tested for the ability to induce T-cell proliferation
in peripheral blood mononuclear cell (PBMC) preparations. The PBMCs from donors
known to be PPD skin test positive and whose T-cells were shown to proliferate in response
to PPD and crude soluble proteins from M. tuberculosis were cultured in RPMI 1640
supplemented with 10% pooled human serum and 50 µg/ml gentamicin. Purified
polypeptides were added in duplicate at concentrations of 0.5 to 10 µg/ml. After six days of
culture in 96-well round-bottom plates in a volume of 200 µl, 50 µl of medium was
removed from each well for determination of IFN-γ levels, as described below in Section
6.1.3. The plates were then pulsed with 1 µCi/well of tritiated thymidine for a further 18
hours, harvested and tritium uptake determined using a gas scintillation counter. Fractions
that resulted in proliferation in both replicates three fold greater than the proliferation
observed in cells cultured in medium alone were considered positive.
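
The positivity criterion above amounts to a simple comparison of each replicate against the medium-only background. A minimal sketch follows; the function name and the counts used in the example are hypothetical.

```python
def is_positive(replicate_cpm, medium_cpm, fold=3.0):
    """True when every replicate exceeds `fold` times the medium-alone counts."""
    if medium_cpm <= 0:
        raise ValueError("medium_cpm must be positive")
    if not replicate_cpm:
        raise ValueError("at least one replicate is required")
    return all(cpm > fold * medium_cpm for cpm in replicate_cpm)


if __name__ == "__main__":
    # Hypothetical tritium counts (cpm) for duplicate wells and a medium-only control.
    print(is_positive([4000, 3500], medium_cpm=1000))
    print(is_positive([4000, 2500], medium_cpm=1000))
```

Requiring both replicates to pass, rather than their mean, follows the wording "in both replicates" and guards against a single outlying well.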

6.1.3. INTERFERON-γ ASSAY

Spleens from mice were removed aseptically and single cell suspension prepared in
complete RPMI following lysis of red blood cells. 100 µl of cells (2x10'^ cells) were plated
per well in a 96-well flat bottom microtiter plate. Cultures were stimulated with the
indicated recombinant proteins for 24 h and the supernatant assayed for IFN-γ.

The levels of supernatant IFN-γ were analysed by sandwich ELISA, using antibody
pairs and procedures available from PharMingen. Standard curves were generated using
recombinant mouse cytokines. ELISA plates (Corning) were coated with 50 µl/well (1
µg/ml, in 0.1 M bicarbonate coating buffer, pH 9.6) of a cytokine capture mAb (rat
anti-mouse IFN-γ (PharMingen; Cat. # 18181D)), and incubated for 4 h at room temp. Plate
contents were shaken out and the plates blocked with PBS-0.05% Tween, 1.0% BSA
(200 µl/well) overnight at 4°C and washed 6X in PBS-0.1% Tween. Standards (mouse IFN-γ)
and supernatant samples diluted in PBS-0.05% Tween, 0.1% BSA were then added for 2 hr at
room temp. The plates were washed as above and then incubated for 2 hr at room
temperature with 100 µl/well of a second Ab (biotin rat anti-mouse IFN-γ (Cat. # 18112D;
PharMingen)) at 0.5 µg/ml diluted in PBS-0.05% Tween, 0.1% BSA. After washing, plates
were incubated with 100 µl/well of streptavidin-HRP (Zymed) at a 1:2500 dilution in
PBS-0.05% Tween, 0.1% BSA at room temp for 1 hr. The plates were washed one last time
and developed with 100 µl/well TMB substrate (3,3',5,5'-tetramethylbenzidine, Kirkegaard
and Perry, Gaithersburg, MD) and the reaction stopped after color developed, with H2SO4,
50 µl/well. Absorbance (OD) was determined at 450 nm using 570 nm as a reference
wavelength and the cytokine concentration evaluated using the standard curve.
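
Reading a concentration off the standard curve can be sketched as below. This is an illustration under stated assumptions: the reference-wavelength correction is taken to be a simple subtraction (OD450 minus OD570), the curve is interpolated linearly between neighbouring standards (the text does not specify a curve-fitting method), and the standards in the example are hypothetical values rather than data from this specification.

```python
from bisect import bisect_left


def corrected_od(od_450, od_570):
    """Subtract the reference-wavelength reading from the 450 nm reading."""
    return od_450 - od_570


def interpolate_concentration(od, standards):
    """Linear interpolation on a standard curve.

    `standards` is a list of (concentration, corrected OD) pairs whose OD values
    increase strictly with concentration.
    """
    points = sorted(standards, key=lambda point: point[1])
    ods = [point[1] for point in points]
    if od < ods[0] or od > ods[-1]:
        raise ValueError("OD lies outside the range of the standard curve")
    i = bisect_left(ods, od)
    if ods[i] == od:
        return points[i][0]
    (c0, o0), (c1, o1) = points[i - 1], points[i]
    return c0 + (od - o0) * (c1 - c0) / (o1 - o0)


if __name__ == "__main__":
    # Hypothetical standards: (concentration, corrected OD).
    standards = [(0.0, 0.05), (100.0, 0.25), (1000.0, 1.25)]
    sample = corrected_od(0.80, 0.05)
    print(interpolate_concentration(sample, standards))
```

Samples whose corrected OD falls outside the standards raise an error rather than being extrapolated; such samples would normally be re-assayed at a different dilution.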

6.2. RESULTS 

6.2.1. TRI-FUSION PROTEINS INDUCED IMMUNE RESPONSES 
Three coding sequences for M. tuberculosis antigens were inserted into an
expression vector for the production of a fusion protein. The antigens designated Ra12,
TbH9 and Ra35 were produced as one recombinant fusion protein (Figure 1A and 1B).
Antigens Erd14, DPV and MTI were produced as a second fusion protein (Figure 2). The
two fusion proteins were affinity purified for use in in vitro and in vivo assays.

The two fusion proteins were tested for their ability to stimulate T cell responses
from six PPD+ subjects. When T cell proliferation was measured, both fusion proteins
exhibited a similar reactivity pattern as their individual components (Figure 14A-14F). A
similar result was obtained when IFN-γ production was measured (Figure 15A-15F). For
example, subject D160 responded to antigens TbH9 and MTI individually. Subject D160
also responded to the fusion proteins that contained these antigens (Figure 14B and 15B).
In contrast, no T cell response from D160 was observed to other antigens individually.
Another subject, D201, who did not react with antigens Erd14, DPV or MTI individually,
was also unresponsive to the fusion protein containing these antigens. It should be noted
that when the T cell responses to the individual components of the two fusion proteins were
not particularly strong, the fusion proteins stimulated responses that were equal to or higher
than that induced by the individual antigens in most cases.

The Ra12-TbH9-Ra35 tri-fusion protein was also tested as an immunogen in vivo.
In these experiments, the fusion protein was injected into the footpads of mice for
immunization. Each group of three mice received the protein in a different adjuvant
formulation: SBAS1c, SBAS2 (Ling et al., 1997, Vaccine 15:1562-1567), SBAS7 and
Al(OH)3. After two subcutaneous immunizations at three week intervals, the animals were
sacrificed one week later, and their draining lymph nodes were harvested for use as
responder cells in T cell proliferation and cytokine production assays.

Regardless of which adjuvant was used in the immunization, strong T cell proliferation
responses were induced against TbH9 when it was used as an individual antigen (Figure
16A). Weaker responses were induced against Ra35 and Ra12 (Figure 16B and 16C).
When the Ra12-TbH9-Ra35 fusion protein was used as immunogen, a response similar to
that against the individual components was observed.

When cytokine production was measured, adjuvants SBAS1c and SBAS2 produced
similar IFN-γ (Figure 17) and IL-4 responses (Figure 18). However, the combination of
SBAS7 and aluminum hydroxide produced the strongest IFN-γ responses and the lowest
level of IL-4 production for all three antigens. With respect to the humoral antibody
response in vivo, Figure 19A-19F shows that the fusion protein elicited both IgG1 and IgG2a
antigen-specific responses when it was used with any of the three adjuvants.

Additionally, C57BL/6 mice were immunized with a combination of two expression
constructs each containing Ra12-TbH9-Ra35 (Mtb32A) or Erd14-DPV-MTI (Mtb39A)
coding sequence as DNA vaccines. The immunized animals exhibited significant protection
against tuberculosis upon a subsequent aerosol challenge of live bacteria. Based on these
results, a fusion construct of Mtb32A and Mtb39A coding sequences was made, and its
encoded product tested in a guinea pig long term protection model. In these studies, guinea
pigs were immunized with a single recombinant fusion protein or a mixture of Mtb32A and
Mtb39A proteins in formulations containing an adjuvant. Figure 20A-20C shows that
guinea pigs immunized with the fusion protein in SBAS1c or SBAS2 were better protected
against the development of tuberculosis upon subsequent challenge, as compared to animals
immunized with the two antigens in a mixture in the same adjuvant formulation. The fusion
proteins in SBAS2 formulation afforded the greatest protection in the animals. Thus, fusion
proteins of various M. tuberculosis antigens may be used as more effective immunogens in
vaccine formulations than a mixture of the individual components.

6.2.2. BI-FUSION PROTEIN INDUCED IMMUNE RESPONSES

A bi-fusion protein containing the TbH9 and Tb38-1 antigens without a
hinge sequence was produced by recombinant methods. The ability of the TbH9-Tb38-1
fusion protein to induce T cell proliferation and IFN-γ production was examined. PBMC
from three donors were employed: one donor had been previously shown to respond to
TbH9 but not to Tb38-1 (donor 131); one had been shown to respond to Tb38-1 but not to
TbH9 (donor 184); and one had been shown to respond to both antigens (donor 201). The
results of these studies demonstrate the functional activity of both the antigens in the fusion
protein (Figures 21A and 21B, 22A and 22B, and 23A and 23B).

6.2.3. A TETRA-FUSION PROTEIN REACTED WITH
TUBERCULOSIS PATIENT SERA

A fusion protein containing TbRa3, 38 kD antigen, Tb38-1 and DPEP was produced
by recombinant methods. The reactivity of this tetra-fusion protein, referred to as TbF-2,
with sera from M. tuberculosis-infected patients was examined by ELISA. The results of
these studies (Table 1) demonstrate that all four antigens function independently in the
fusion protein.

One of skill in the art will appreciate that the order of the individual antigens within
each fusion protein may be changed and that comparable activity would be expected
provided that each of the epitopes is still functionally available. In addition, truncated
forms of the proteins containing active epitopes may be used in the construction of fusion
proteins.

The present invention is not to be limited in scope by the exemplified embodiments
which are intended as illustrations of single aspects of the invention, and any clones,
nucleotide or amino acid sequences which are functionally equivalent are within the scope
of the invention. Indeed, various modifications of the invention in addition to those
described herein will become apparent to those skilled in the art from the foregoing
description and accompanying drawings. Such modifications are intended to fall within the
scope of the appended claims. It is also to be understood that all base pair sizes given for
nucleotides are approximate and are used for purposes of description.

All publications cited herein are incorporated by reference in their entirety.

Table 1

Reactivity of TbF-2 Fusion Protein with TB and Normal Sera

                  TbF             TbF-2           ELISA Reactivity
Serum ID  Status  OD450   Status  OD450   Status  38 kD  TbRa3  Tb38-1  DPEP
B931-40   TB      0.57    +       0.321   +              +              +
B931-41   TB      0.601           0.396   +              +
B931-109  TB      0.494   +       0.404   +       +      +      ±±
B931-132  TB      1.502           1.292                  +      +       ±±
5004      TB      1.806   +       1.666   +       ±±     ±±     +
15004     TB      2.862           2.468   +       +
39004     TB      2.443   +       1.722   +       +      +
68004     TB      2.871   +       2.575   +       +
99004     TB      0.691   +       0.971   +              ±±     +
107004    TB      0.875   +       0.732   +              ±±
92004     TB      1.632   +       1.394   +       +      ±±     ±±
97004     TB      1.491   +       1.979   +       +      ±±
118004    TB      3.182   +       3.045           +      ±±
173004    TB      3.644   +       3.578           +      +
175004    TB      3.332   +       2.916   +       +      +
274004    TB      3.696   +       3.716                                 +
276004    TB      3.243   +       2.56    +                     +
282004    TB      1.249   +       1.234   +       +
289004    TB      1.373   +       1.17    +              +
308004    TB      3.708           3.355   +                     +
314004    TB      1.663   +       1.399   +                     +
317004    TB      1.163   +       0.92    +       +
312004    TB      1.709           1.453   +              +
380004    TB      0.238   -       0.461   +              ±±             +
451004    TB      0.18    -       0.2     -                             ±±
478004    TB      0.188   -       0.469   +                             ±±
410004    TB      0.384           2.392           ±+                    +
411004    TB      0.306   +       0.874   +              +              +
421004    TB      0.357   +       1.456   +              +              +
528004    TB      0.047   -       0.196   -
A6-87     Normal  0.094   -       0.063   -       -
A6-88     Normal  0.214   -       0.19    -
A6-89     Normal  0.248   -       0.125   -       -
A6-90     Normal  0.179   -       0.206   -       -      -      -       -
A6-91     Normal  0.135           0.151   -
A6-92     Normal  0.064   -       0.097   -
A6-93     Normal  0.072           0.098
A6-94     Normal  0.072           0.064
A6-95     Normal  0.125           0.159
A6-96     Normal  0.121           0.12

Cut-off           0.284           0.266
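
The status calls in Table 1 follow from comparing each OD450 reading with the cut-off for the corresponding antigen. The sketch below applies that comparison to five rows copied from the table; the data structure and function names are hypothetical, and treating a reading as positive only when it is strictly above the cut-off is an assumption.

```python
# Cut-off values from the last row of Table 1.
CUTOFF = {"TbF": 0.284, "TbF-2": 0.266}

# (serum ID, clinical status, TbF OD450, TbF-2 OD450), copied from Table 1.
SERA = [
    ("380004", "TB", 0.238, 0.461),
    ("451004", "TB", 0.18, 0.2),
    ("478004", "TB", 0.188, 0.469),
    ("528004", "TB", 0.047, 0.196),
    ("A6-88", "Normal", 0.214, 0.19),
]


def call(od, antigen):
    """Return '+' when the reading exceeds the antigen's cut-off, otherwise '-'."""
    return "+" if od > CUTOFF[antigen] else "-"


def score(sera):
    """Map each serum ID to its (TbF call, TbF-2 call) pair."""
    return {
        serum_id: (call(tbf, "TbF"), call(tbf2, "TbF-2"))
        for serum_id, _status, tbf, tbf2 in sera
    }


if __name__ == "__main__":
    for serum_id, calls in score(SERA).items():
        print(serum_id, *calls)
```

For these rows the computed calls agree with the status columns of the table: sera 380004 and 478004 are negative with TbF but positive with TbF-2.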













WO 99/51748                                                        PCT/US99/07717

WHAT IS CLAIMED IS:

1. A purified polypeptide comprising an amino acid sequence selected from the
group consisting of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22 and 24, said amino
acid sequence may optionally contain one or more conservative amino acid substitutions.

2. A purified polypeptide encoded by a polynucleotide that hybridizes under
moderately stringent conditions to a second polynucleotide which is complementary to a
nucleotide sequence that encodes the amino acid sequence selected from the group
consisting of SEQ ID NOS: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22 and 24, said amino acid
sequence induces an immune response to M. tuberculosis.

3. The polypeptide of Claim 2 which is a soluble polypeptide.

4. The polypeptide of Claim 2 which is produced by a recombinant DNA
method.

5. The polypeptide of Claim 2 which is produced by a chemical synthetic
method.

6. The polypeptide of Claim 2 which induces an antibody response.

7. The polypeptide of Claim 2 which induces a T cell response.

8. The polypeptide of Claim 2 which is fused with a second heterologous
polypeptide.

9. A method of preventing tuberculosis, comprising administering to a subject
an effective amount of the polypeptide of Claim 1.

10. A method of preventing tuberculosis, comprising administering to a subject
an effective amount of the polypeptide of Claim 2.

11. A method of preventing tuberculosis, comprising administering to a subject
an effective amount of a polynucleotide that encodes the polypeptide of Claim 2.

12. A pharmaceutical composition comprising the polypeptide of Claim 2.

13. A pharmaceutical composition comprising a polynucleotide that encodes the
polypeptide of Claim 2.

5 



10 



15 



20 



25 



30 



35 



34 
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' — _ 

' rc TiCi-i r r r ; tct r i ic r t T::;ic^i.NCi^-:iriC-Ti rccATC-cCiTC -CCi r: - CiCcccccccTcccATiACTTCC^ccrcTccziC-^rcc 

.tHHHHHHTAASONrCLSCCC 

1 TbRal2 I 

CC^CCCATTCCCCATrcCCATCCCCCiCCCGirCCCCATCCCCCCCC'i.CiTCCSiTCCCCrcCCGCGTCACCCACCC'TC'iTArCCCCCCriCCCCCTrc 

i : > _ — = 2C0 

OCfA \ p I cOAnAlAC0(rtSCCCS ?rvH!CPTif 

iTb Ral2 I 

CrCCCCTTCCCTCTTCTCCACAACAACCCCAACGGCGCACCACTCCAACCCCTCCTCCCGiCCCCTCCCCCCCCAACTCTCCCCATCTCCACCCCrCACG 
. : : _^ 200 

LCLCV VONNCNGARVCavvCSA.PiASLCI STCO 
' TbRa^2 

rCArCACCCCCCTCGACCCCCCTCCGATCAACTCGCCCACCCCCATCCCGClCCCCCrTAACGCCCATCATCCCCCTCACCTCATCTCCCrCACCrcCCA 

< . . > . . . : » ^ «QQ 

V I TAVOCAP tflSATArACALNCHHPCOVlSVrwO 

" Tb Rai2 ■ I 



iccAicrccccccccAccccT ACAGcCAACcrcACATTccccGAcccACCc::cccccAi7rcATccTccAr7rccccccc7TACCACccGACArcAAc 

' • = ' '■ '■ = 500 



'<SCC [3 7CNVTLAiC.= ?A£.-nvOfCAt?. PS IN 
■TbRal2 1 I 'TbHQ ■ 



TCCCCCACCArC TACGCCCCCCCGCCrrCGCCCTCCCTGGTGCCCGCCGCrZAGArcrGGCACACCCrcCCC.ACrCACCTCTr'TCCCCCCCCTCGCCCT 

■■ = , : : 5QQ 

S A n Y A c P C S A S L V A A A C rt V 0 S • • V A S 0' L ? S A A 5 i 

Tb H9 ' — . 

riCAGTCGGrccr c :cccctc rcAcccTccccTccrccATAGG7rcGrcc::::crcTCArGCtccc'cccccccTCCCCCTATCTGCCCTccATCAc:c: 

■ . __ . 



•2 S V V V C L 7 V C 3 w { c 5 S : C L ft V A A A S P Y V A V ft S V 

Tb H9 



C-CCGCCGGCCA CGCCCACCTCACCGCCCCCCAGCTCCCCGrTCCrCCCCC:::CTACCACACCCCCTA7CCCCTCACCCTCCCCCCCCCGGTCA7CGCC 
— • ' — ' 3CX) 



*-C0 i£L rAACVRVAAAiY£7AYCl.rV?P?Vl A 

' Tb H9 • ' 



CACAACCCrcCrC AACrCATCA; ■ •:TCArACCCACCAACCTCTTCCC':C--->:ACCCCGCCCATCCCCGTCAACCAGCCCCAATACCCCCAGArCTCGG 

: : . ^ 900 



I L I A r fj L L C C r ? A I A V M ^ A £ r c 

TbH9 ■ 



CCC^ACACGCCC C':GCGA:Gr:7CCCTACCCCCCCCCCACCCCGACCCCC-C ::CCACC7-CC:CCCCr7CCAGCACCCCCCCCACArCACC ACCCCCCC 

— — — f 1^^ — : ' 1000 

- C OA a Ar. C Y A^ -A A r A T A F A F L L?F I Z ^ P £ M-T S AC 

' Tb H9 ■ 

TGGCCrCCTCCACC ACCCCCCCCCGGTCCACCACCCCTCCGACACCCCCCC:c:CAACCACrTCATGAACAATGTCCCCCACCCCCTGCAACACCTCCCC 

— • : ' . . : , . . 1100 

^'-LE Oaa AVE£ASOt AiAN0LnNNVPGAl.OQl.A 

Tb H9 ■ 



CAGCCCACCCACCCC ACCACCCCTTCTTCCAACCrCGCTCGCC7CTGGAACiCCCTCTCCCCCCiTCCCTCGCCGATCAGCAACATCCTGTCGArCCCCA 

= — > ^ > ^- .. 1200 

°'*^0CTTPsSlCLCCLV)crvs?H35PISNnvSfiA 

I Tb H9 ' 

ACAACCACATCTCCArG ACCAACTCGCC7CTCTCCArGACCAACACCTTGAC:rcCATCT7CAACCCCTTrcCTCCCCCCCCCCCCCCCCACCCCCTCCA 

' * ' — > . ' 1300 

^ " S M TNSCvSnTHlLSSnLXCrAPAAAAOAVO 

' Tb H9 I 

AACCCCCCCCCAAAACC3CC rCCCCGCCATCAGC7CCCTCCCCACCTCCCTC:GrTCTrcGCCTC7CGCCCCTCCCCTCCCCCCCAACTTCCCTCCCCCC 

• ■' . . — , . moo 



^ ^ 0 C7 0An5SLCS5LC55CLCCCVAANLCaA 

To H9 



G:'.rccc7cccr:cc:rc7 ccc:GCCGCACccc:ccG:cGCCc:c--C':-:::-crc-:cc-GcccccccGcccccrcccccTCACCAcccrcAC^ 

^ • w ,500 

^ ^5LSvpOA VAAA;;Civfp::iPAi.Pl.r5Lrs 



' TbH9 
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:cccccAAACASCCCCCccccA;.7ecrccseccccTcccce.:ccccAGAT:::c::cAsccccccTccTssc :rcASTscTCTC^^ 

. A E a C P G 0 H L C C U f V = C .-. C A S A G C C L 5 C V L 3 v P P 



■ToH9 

:CGACCCTATCTCATCCCCCArTCTCCGCCAGCCGCCGATATCCCCCCGCCCG:CrT;rCuCAGCACCGGTTCG:CCACT7CCCCuCC:rCC:C:TC^ 



„ p r V n P H 3 P A A C 0 I A P P ^ L S Q 0 s f X O f ? ^ L P L 
.TbH9 ! — "T^f^S 



I U 1-13 — A 

CCCrCCCCCATCGTCCCCCAACrCCGCCCACACGTCCTCAACATCAACACCAA.CTCGCCTACAACAACCCCCTCCCCC CCCCCACC^^^ ^^^^ 
p S A n V A 0 V G P 0 V V N I N T X U C Y N N A V C A C T C I V I 

■ TbRa35 



ATCCCAACCCTCTCGTCCTGACCAACAACCACCTCATCGCGCCCCCCACCCACATCAATGCCTTCACCCTCGCCTCCCG^ ^^^^ 
0 P N C V V L T N N H V t A C A T 0 I ^ A F 5 V C 5 C Q T Y C V 0 V 

cctccgctatgaccccacccaccatctcgccctcctccagctccccgctgccc:tgccctaccatcccccgccatcggtcccgccctcgccgttcctcac ^^^^ 

V C r 0 R T Q 0 V A V L Q L R C A C C L P 5 ^ ' C C C V A V G £ 

CCCTTCG7CGCGATGGGCAACACCCCTCCCCACCGCGGAACGCCCCCTCCGGTGCCTCCCACCCTCCTCCCCCTCCCCCAAACCCTGCAGGCCrCCGA77 ^^^^ 

P F V A r: C N i"G G 0 C G T P R A V ? G a V V ^ ^ ^ , P ' ^ 0 ^ S 0 

«TbRa35 



CGCrCACCCCrCCCGAACACACArTCAACCCGTTCATCCACTTCCATCCCGCGArCCACCCCCGTGlTYcCCGCGGCGCCCTCCTCAACCCCCTACCACA ^^^^ 

S L T C A £ r 7 L N G L I Q F 0 A A I G ? C 0 S C G P V ^ ^ C ^ C G 

• TbRa35 



CGTCCrCGCrATGAACACCCCCGCGTCCTACCATATCCATCACACTGGCCGCCC-CCACCACATCCCCNTCrAACAAAGCCCGAAA 



2237 



V V C « T A A* S 

— — — Tb Ra35 >
CirATACATAiccAiCACCATCAccATCACArccccACCACCCi rcccc':cACCGCc^c:ccccGTCccTcrTcccccACTTTrc7CAs::c: :c::cs:
, . = 

nHHMHHHftATTLr>VORHPRSUFP£FStLrA 
I ' 'ERDM 

CCTTCCCCTCATTCCCCCCACTCCGCCCCACCTTCCACACCCGCnCATGCCCCTCCAAGACCAGATGAAACACCCCCCCTACCAGCTACCCCCCCACCT 
' ' ' ' ^ '• — ^ 200 

AFPSFACLRPTrOTRLnRUEOEttKECRTEVRAEL 

" EflO 14 ■ 

TCCCCCCGTCGACCCCCACAACGACCTCCACATTATGCTCCCCCATGGTCACCTGACCATCAAGGCCCACCGCACCGACCACAAGCACTTCCACCCTCCC 
' ' T"- ' ' ' ' ' 300 

PCvOPOKDVOInvROCQLTlKAERTEGKOFOCR 

TCCCAATTCCCCTACCCTTCCTTCCTTCGCACCCTCTCGCTCCCCGTACCTCCTCACCACCACCACATTAACCCCACCTACCACAACCCCATTCTTACTG 
— ' ■ . — ^ ^ . . . 400

StrAYCSFVRTVSUPVCAOEOOlK ATYOKGILT 

ERD 14 .

TCTCCCTCGCCCTTTCCGAAGGCAACCCAACCCAAAACCACATTCAGATCCCCTCCACCAACAAGCTTCAiCCCCTCCACCCCCTCATTAACACCACCTC 

■ ' ■ ' ' ' ' ■ 500 

VSVAVStCKPTEKHlQiRSTN K L D PVOavInTTC 
■' ERD14 I Hindu f DPV ■ 

CAATTACGCCCACCTACrAGCTCCCCTCAACCCCACCCATCCCCCCCCrCCCCCACACTTCAACCCCTCACCCCTCCCCCACTCCTATTTCCCCAATTTC 

— ^ ■— ^ : — . . 1 . , . 600 

- TfCOvVAALNA TOPCAAAOFfJASPVAQSYURNF - 

DPV ■ 

CTCCCCCCACCGCCACCTCACCGCCCTCCCATGCCCCCCCAATrcCAACCTCTCCCCCCCCCCCCACACTACATCCCCCTTCTCCACTCCCTTCCCCCCT 
— ^ ' . ' > . ' 700 

•-"A PPPQRAAnAAOLOAVPCAAO T I CLV£5VAC 

' DPV I 

CCTCCAACAA CTATCACCTCATCACGATTAffTTACCAGTTCCCCCACGrcCACCCTCATGCCCCCATCATCCGCGCTCACCCCCCCTCtCTTCACCCCCA 

— ' — . , , . 1 ■ 800

S C N N r E i n T I NrOFCOVOAHCAfllRAQAASLEAE 
——DPV — i tSacl I ' 'MTI 

GCATCAGGCCATCG TTCGTCATGTGTTGCCCCCGGCTCACTTTTCCGCCCCCCCCGGTTCCCTGCCTTCCCAGCACTTCATTACCCACTTCCCCCCTAAC 
— ^ . ^ . > : . ^ . . ^ 

J VR0VLAAC0FWCCAC5VACQEF I TQLCRN 

MTI 

TTCCACCTCATC TACGAGCAGCCCAACCCCCACGCGCACAACCTGCACCCTCCCCGCAACAACATGCCCCAAACCGACACCCCCCTCCCCTCCAGCTCGC 

* * ' ' ' ' 1000 

t ^eqanahcokvoaackhmaqtosavcssw 

MTI ■ 

ccactactaa ccgccgccagtctgctgcaattctccacatatccatcacactccccgcccctccaccagatccccctccta 

' — ^ ^ ^ : ^ 1031
TGTTCTTCGA CGGCAGGCTG GTGGAGGAAG GGCCCACCGA ACAGCTGTTC TCCTCGCCGA
60

AGCATGCGGA AACCGCCCGA TACGTCGCCG GACTGTCGGG GGACGTCAAG GACGCCAAGC
120

GCGGAAATTG AAGAGCACAG AAAGGTATGG C GTG AAA ATT CGT TTG CAT ACG
172

Val Lys Ile Arg Leu His Thr
1 5

CTG TTG GCC GTG TTG ACC GCT GCG CCG CTG GTG CTA GCA GCG GCG GGC
220

Leu Leu Ala Val Leu Thr Ala Ala Pro Leu Leu Leu Ala Ala Ala Gly
10 15 20

TGT GGC TCG AAA CCA CCG AGC GGT TCG CCT GAA ACG GGC GCC GGC GCC
268

Cys Gly Ser Lys Pro Pro Ser Gly Ser Pro Glu Thr Gly Ala Gly Ala
25 30 35

GGT ACT GTC GCG ACT ACC CCC GCG TCG TCG CCG GTG ACG TTG GCG GAG
316

Gly Thr Val Ala Thr Thr Pro Ala Ser Ser Pro Val Thr Leu Ala Glu
40 45 50 55

ACC GGT AGC ACG CTG CTC TAC CCG CTG TTC AAC CTG TGG GGT CCG GCC
364

Thr Gly Ser Thr Leu Leu Tyr Pro Leu Phe Asn Leu Trp Gly Pro Ala
60 65 70

TTT CAC GAG AGG TAT CCG AAC GTC ACG ATC ACC GCT CAG GGC ACC GGT
412

Phe His Glu Arg Tyr Pro Asn Val Thr Ile Thr Ala Gln Gly Thr Gly
75 80 85

TCT GGT GCC GGG ATC GCG CAG GCC GCC GCC GGG ACG GTC AAC ATT GGG
460

Ser Gly Ala Gly Ile Ala Gln Ala Ala Ala Gly Thr Val Asn Ile Gly
90 95 100

GCC TCC GAC GCC TAT CTG TCG GAA GGT GAT ATG GCC GCG CAC AAG GGG
508

Ala Ser Asp Ala Tyr Leu Ser Glu Gly Asp Met Ala Ala His Lys Gly
105 110 115

CTG ATG AAC ATC GCG CTA GCC ATC TCC GCT CAG CAG GTC AAC TAC AAC
556

Leu Met Asn Ile Ala Leu Ala Ile Ser Ala Gln Gln Val Asn Tyr Asn
120 125 130 135

CTG CCC GGA GTG AGC GAG CAC CTC AAG CTG AAC GGA AAA GTC CTG GCG
604

Leu Pro Gly Val Ser Glu His Leu Lys Leu Asn Gly Lys Val Leu Ala
140 145 150

GCC ATG TAC CAG GGC ACC ATC AAA ACC TGG GAC GAC CCG CAG ATC GCT
652

Ala Met Tyr Gln Gly Thr Ile Lys Thr Trp Asp Asp Pro Gln Ile Ala
155 160 165

GCG CTC AAC CCC GGC GTG AAC CTG CCC GGC ACC GCG GTA GTT CCG CTG
700

Ala Leu Asn Pro Gly Val Asn Leu Pro Gly Thr Ala Val Val Pro Leu
170 175 180

CAC CGC TCC GAC GGG TCC GGT GAC ACC TTC TTG TTC ACC CAG TAC CTG
748

His Arg Ser Asp Gly Ser Gly Asp Thr Phe Leu Phe Thr Gln Tyr Leu
185 190 195

TCC AAG CAA GAT CCC GAG GGC TGG GGC AAG TCG CCC GGC TTC GGC ACC
796

Ser Lys Gln Asp Pro Glu Gly Trp Gly Lys Ser Pro Gly Phe Gly Thr
200 205 210 215

ACC GTC GAC TTC CCG GCG GTG CCG GGT GCG CTG GGT GAG AAC GGC AAC
844

Thr Val Asp Phe Pro Ala Val Pro Gly Ala Leu Gly Glu Asn Gly Asn
220 225 230

GGC GGC ATG GTG ACC GGT TGC GCC GAG ACA CCG GGC TGC GTG GCC TAT
892

Gly Gly Met Val Thr Gly Cys Ala Glu Thr Pro Gly Cys Val Ala Tyr
235 240 245

ATC GGC ATC AGC TTC CTC GAC CAG GCC AGT CAA CGG GGA CTC GGC GAG
940

Ile Gly Ile Ser Phe Leu Asp Gln Ala Ser Gln Arg Gly Leu Gly Glu
250 255 260

GCC CAA CTA GGC AAT AGC TCT GGC AAT TTC TTG TTG CCC GAC GCG CAA
988

Ala Gln Leu Gly Asn Ser Ser Gly Asn Phe Leu Leu Pro Asp Ala Gln
265 270 275

AGC ATT CAG GCC GCG GCG GCT GGC TTC GCA TCG AAA ACC CCG GCG AAC
1036

Ser Ile Gln Ala Ala Ala Ala Gly Phe Ala Ser Lys Thr Pro Ala Asn
280 285 290 295

CAG GCG ATT TCG ATG ATC GAC GGG CCC GCC CCG GAC GGC TAC CCG ATC
1084

Gln Ala Ile Ser Met Ile Asp Gly Pro Ala Pro Asp Gly Tyr Pro Ile
300 305 310

ATC AAC TAC GAG TAC GCC ATC GTC AAC AAC CGG CAA AAG GAC GCC GCC
1132

Ile Asn Tyr Glu Tyr Ala Ile Val Asn Asn Arg Gln Lys Asp Ala Ala
315 320 325

ACC GCG CAG ACC TTG CAG GCA TTT CTG CAC TGG GCG ATC ACC GAC GGC
1180

Thr Ala Gln Thr Leu Gln Ala Phe Leu His Trp Ala Ile Thr Asp Gly
330 335 340

AAC AAG GCC TCG TTC CTC GAC CAG GTT CAT TTC CAG CCG CTG CCG CCC
1228

Asn Lys Ala Ser Phe Leu Asp Gln Val His Phe Gln Pro Leu Pro Pro
345 350 355

GCG GTG GTG AAG TTG TCT GAC GCG TTG ATC GCG ACG ATT TCC AGC
1273

Ala Val Val Lys Leu Ser Asp Ala Leu Ile Ala Thr Ile Ser Ser
360 365 370

TAGCCTCGTT GACCACCACG CGACAGCAAC CTCCGTCGGG CCATCGGGCT GCTTTGCGGA
1333

GCATGCTGGC CCGTGCCGGT GAAGTCGGCC GCGCTGGCCC GGCCATCCGG TGGTTGGGTG
1393
GGATAGGTGC GGTGATCCCG
1453 

AGGCGATGGG TGCGATCAGG 
1513 

CAGGCAACAC CTACGGCGAA 
1573 

CTACGGGGCG TTGCCGCTGA 
1633 

CGCGGTGCCG GTCTCTGTAG 
1693

GGCCGAGGCT GTGGGAATAG 
1753 

TTTGTGGGGG GCAATGACGT 
1813 

TCACAACGCT CCCGATGTGC 
1873 

GGGCATGTTG GTGTCCGGTC 
1933 

CACTCATGAC CTGTTCCGGC 
1993 




CTGCTTGCGC TGGTCTTGGT 
CTCAACGGGT TGCATTTCTT 
ACCGTTGTCA CCGACGCGTC 
TCGTCGGGAC GCTGGCGACC 
GAGCGGCGCT GGTGATCGTG 
TCCTGGAATT GCTCGCCGGA 
TCGGGCCGTT CATCGCTCAT 
CGGTGCTGAA CTACTTGCGC 
TGGTGTTGGC GGTGATGGTC 
AGGTGCCGGT GTTGCCCCGG 



GCTGGTGGTG CTGGTCATCG 
CACCGCCACC GAATGGAATC 
GCCCATCCGG TCGGCGCCTA 
TCGGCAATCG CCCTGATCAT 
GAACGGCTGC CGAAACGGTT
ATCCCCAGCG TGGTCGTCGG
CACATCGCTC CGGTGATCGC
GGCGACCCGG GCAACGGGGA
GTTCCCATTA TCGCCACCAC 
GAGGGCGCGA TCGGGAATTC 
GGTCTTGACC ACCACCTGGG TGTCGAAGTC GGTGCCCGGA TTGAAGTCCA GGTACTCGTG 60

GGTGGGGCGG GCGAAACAAT AGCGACAAGC ATGCGAGCAG CCGCGGTAGC CGTTGACGGT 120 

GTAGCGAAAC GGCAACGCGG CCGCGTTGGG CACCTTGTTC AGCGCTGATT TGCACAACAC 180 

CTCGTGGAAG GTGATGCCGT CGAATTGTGG CGCGCGAACG CTGCGGACCA GGCCGATCCG 240 

CTGCAACCCG GCAGCGCCCG TCGTCAACGG GCATCCCGTT CACCGCGACG GCTTGCCGGG 300 

CCCAACGCAT ACCATTATTC GAACAACCGT TCTATACTTT GTCAACGCTG GCCGCTACCG 360 

AGCGCCGCAC AGGATGTGAT ATGCCATCTC TGCCCGCACA GACAGGAGCC AGGCCTTATG 420 

ACAGCATTCG GCGTCGAGCC CTACGGGCAG CCGAAGTACC TAGAAATCGC CGGGAAGCGC 480 

ATGGCGTATA TCGACGAAGG CAAGGGTGAC GCCATCGTCT TTCAGCACGG CAACCCCACG 540 

TCGTCTTACT TGTGGCGCAA CATCATGCCG CACTTGGAAG GGCTGGGCCG GCTGGTGGCC 600 

TGCGATCTGA TCGGGATGGG CGCGTCGGAC AAGCTCAGCC CATCGGGACC CGACCGCTAT 660 

AGCTATGGCG AGCAACGAGA CTTTTTGTTC GCGCTCTGGG ATGCGCTCGA CCTCGGCGAC 720 

CACGTGGTAC TGGTGCTGCA CGACTGGGGC TCGGCGCTCG GCTTCGACTG GGCTAACCAG 780

CATCGGGACC GAGTGCAGGG GATCGCGTTC ATGGAAGCGA TCGTCACCCC GATGACGTGG 840 

GCGGACTGGC CGCCGGCCGT GCGGGGTGTG TTCCAGGGTT TCCGATCGCC TCAAGGCGAG 900 

CCAATGGCGT TGGAGCACAA CATCTTTGTC GAACGGGTGC TGCCCGGGGC GATCCTGCGA 960 

CAGCTCAGCG ACGAGGAAAT GAACCACTAT CGGCGGCCAT TCGTGAACGG CGGCGAGGAC 1020 

CGTCGCCCCA CGTTGTCGTG GCCACGAAAC CTTCCAATCG ACGGTGAGCC CGCCGAGGTC 1080 

GTCGCGTTGG TCAACGAGTA CCGGAGCTGG CTCGAGGAAA CCGACATGCC GAAACTGTTC 1140 

ATCAACGCCG AGCCCGGCGC GATCATCACC GGCCGCATCC GTGACTATGT CAGGAGCTGG 1200 

CCCAACCAGA CCGAAATCAC AGTGCCCGGC GTGCATTTCG TTCAGGAGGA CAGCGATGGC 1260 
GTCGTATCGT GGGCGGGCGC TCGGCAGCAT CGGCGACCTG GGAGCG[illegible] 1320

GACCAAGAAT GTGATTTCCG GCGAAGGCGG CGCCCTGCTT [illegible] 1380

GCTCCGGGCA GAGATTCTCA GGGAAAAGGG CACCAATCGC AGCCG[illegible] 1440

GGTCGACAAA TATACGTGGC AGGACAAAGG TCTTCCTATT [illegible] 1500

GCCTTTCTAT GGGCTCAGTT CGAGGAAGCC GAGCGGATCA CGCGTATCCG ATTGGACCTA 1560

TGGAACCGGT ATCATGAAAG CTTCGAATCA TTGGAACAGC GGGGGCTCCT GCGCCGTCCG 1620

ATCATCCCAC AGGGCTGCTC TCACAACGCC CACATGTACT ACGTGTTACT AGCGCCCAGC 1680

GCCGATCGGG AGGAGGTGCT GGCGCGTCTG ACGAGCGAAG GTATAGGCGC GGTCTTTCAT 1740

TACGTGCCGC TTCACGATTC GCCGGCCGGG CGTCGCT 1777

TbH-9: protein sequence

Val Ala Trp Met Ser Val Thr Ala Gly Gln Ala Glu Leu Thr Ala Ala
1 5 10 15

Gln Val Arg Val Ala Ala Ala Ala Tyr Glu Thr Ala Tyr Gly Leu Thr
20 25 30

Val Pro Pro Pro Val Ile Ala Glu Asn Arg Ala Glu Leu Met Ile Leu
35 40 45

Ile Ala Thr Asn Leu Leu Gly Gln Asn Thr Pro Ala Ile Ala Val Asn
50 55 60

Glu Ala Glu Tyr Gly Glu Met Trp Ala Gln Asp Ala Ala Ala Met Phe
65 70 75 80

Gly Tyr Ala Ala Ala Thr Ala Thr Ala Thr Ala Thr Leu Leu Pro Phe
85 90 95

Glu Glu Ala Pro Glu Met Thr Ser Ala Gly Gly Leu Leu Glu Gln Ala
100 105 110

Ala Ala Val Glu Glu Ala Ser Asp Thr Ala Ala Ala Asn Gln Leu Met
115 120 125

Asn Asn Val Pro Gln Ala Leu Lys Gln Leu Ala Gln Pro Thr Gln Gly
130 135 140

Thr Thr Pro Ser Ser Lys Leu Gly Gly Leu Trp Lys Thr Val Ser Pro
145 150 155 160

His Arg Ser Pro Ile Ser Asn Met Val Ser Met Ala Asn Asn His Met
165 170 175

Ser Met Thr Asn Ser Gly Val Ser Met Thr Asn Thr Leu Ser Ser Met
180 185 190

Leu Lys Gly Phe Ala Pro Ala Ala Ala Ala Gln Ala Val Gln Thr Ala
195 200 205

Ala Gln Asn Gly Val Arg Ala Met Ser Ser Leu Gly Ser Ser Leu Gly
210 215 220

Ser Ser Gly Leu Gly Gly Gly Val Ala Ala Asn Leu Gly Arg Ala Ala
225 230 235 240

Ser Val Arg Tyr Gly His Arg Asp Gly Gly Lys Tyr Ala Xaa Ser Gly
245 250 255

Arg Arg Asn Gly Gly Pro Ala
260

Tb38-1: protein sequence

Thr Asp Ala Ala Thr Leu Ala Gln Glu Ala Gly Asn Phe Glu Arg Ile
1 5 10 15

Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala Gly
20 25 30

Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln Ala
35 40 45

Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu Leu
50 55 60

Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg
65 70 75 80

Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe
85 90 95

TGGCGAATGG GACGCGCCCT GTAGCGGCGC ATTAAGCGCG GCGGGTGTGG TGGTTACGCG 60

CAGCGTGACC GCTACACTTG CCAGCGCCCT AGCGCCCGCT CCTTTCGCTT TCTTCCCTTC 120 

CTTTCTCGCC ACGTTCGCCG GCTTTCCCCG TCAAGCTCTA AATCGGGGGC TCCCTTTAGG 180 

GTTCCGATTT AGTGCTTTAC GGCACCTCGA CCCCAAAAAA CTTGATTAGG GTGATGGTTC 240 

ACGTAGTGGG CCATCGCCCT GATAGACGGT TTTTCGCCCT TTGACGTTGG AGTCCACGTT 300 

CTTTAATAGT GGACTCTTGT TCCAAACTGG AACAACACTC AACCCTATCT CGGTCTATTC 360 

TTTTGATTTA TAAGGGATTT TGCCGATTTC GGCCTATTGG TTAAAAAATG AGCTGATTTA 420 

ACAAAAATTT AACGCGAATT TTAACAAAAT ATTAACGTTT ACAATTTCAG GTGGCACTTT 480 

TCGGGGAAAT GTGCGCGGAA CCCCTATTTG TTTATTTTTC TAAATACATT CAAATATGTA 540 

TCCGCTCATG AATTAATTCT TAGAAAAACT CATCGAGCAT CAAATGAAAC TGCAATTTAT 600

TCATATCAGG ATTATCAATA CCATATTTTT GAAAAAGCCG TTTCTGTAAT GAAGGAGAAA 660

ACTCACCGAG GCAGTTCCAT AGGATGGCAA GATCCTGGTA TCGGTCTGCG ATTCCGACTC 720 

GTCCAACATC AATACAACCT ATTAATTTCC CCTCGTCAAA AATAAGGTTA TCAAGTGAGA 780 

AATCACCATG AGTGACGACT GAATCCGGTG AGAATGGCAA AAGTTTATGC ATTTCTTTCC 840 

AGACTTGTTC AACAGGCCAG CCATTACGCT CGTCATCAAA ATCACTCGCA TCAACCAAAC 900 

CGTTATTCAT TCGTGATTGC GCCTGAGCGA GACGAAATAC GCGATCGCTG TTAAAAGGAC 960 

AATTACAAAC AGGAATCGAA TGCAACCGGC GCAGGAACAC TGCCAGCGCA TCAACAATAT 1020 

TTTCACCTGA ATCAGGATAT TCTTCTAATA CCTGGAATGC TGTTTTCCCG GGGATCGCAG 1080 

TGGTGAGTAA CCATGCATCA TCAGGAGTAC GGATAAAATG CTTGATGGTC GGAAGAGGCA 1140 

TAAATTCCGT CAGCCAGTTT AGTCTGACCA TCTCATCTGT AACATCATTG GCAACGCTAC 1200 

CTTTGCCATG TTTCAGAAAC AACTCTGGCG CATCGGGCTT CCCATACAAT CGATAGATTG 1260 

Rg. Is- A 
TCGCACCTGA TTGCCCGACA TTATCGCGAG CCCATTTATA CCCATATAAA TCAGCATCCA 1320

TGTTGGAATT TAATCGCGGC CTAGAGCAAG ACGTTTCCCG TTGAATATGG CTCATAACAC 1380

CCCTTGTATT ACTGTTTATG TAAGCAGACA GTTTTATTGT TCATGACCAA AATCCCTTAA 1440

CGTGAGTTTT CGTTCCACTG AGCGTCAGAC CCCGTAGAAA AGATCAAAGG ATCTTCTTGA 1500

GATCCTTTTT TTCTGCGCGT AATCTGCTGC TTGCAAACAA AAAAACCACC GCTACCAGCG 1560

GTGGTTTGTT TGCCGGATCA AGAGCTACCA ACTCTTTTTC CGAAGGTAAC TGGCTTCAGC 1620

AGAGCGCAGA TACCAAATAC TGTCCTTCTA GTGTAGCCGT AGTTAGGCCA CCACTTCAAG 1680

AACTCTGTAG CACCGCCTAC ATACCTCGCT CTGCTAATCC TGTTACCAGT GGCTGCTGCC 1740

AGTGGCGATA AGTCGTGTCT TACCGGGTTG GACTCAAGAC GATAGTTACC GGATAAGGCG 1800

CAGCGGTCGG GCTGAACGGG GGGTTCGTGC ACACAGCCCA GCTTGGAGCG AACGACCTAC 1860

ACCGAACTGA GATACCTACA GCGTGAGCTA TGAGAAAGCG CCACGCTTCC CGAAGGGAGA 1920

AAGGCGGACA GGTATCCGGT AAGCGGCAGG GTCGGAACAG GAGAGCGCAC GAGGGAGCTT 1980

CCAGGGGGAA ACGCCTGGTA TCTTTATAGT CCTGTCGGGT TTCGCCACCT CTGACTTGAG 2040

CGTCGATTTT TGTGATGCTC GTCAGGGGGG CGGAGCCTAT GGAAAAACGC CAGCAACGCG 2100

GCCTTTTTAC GGTTCCTGGC CTTTTGCTGG CCTTTTGCTC ACATGTTCTT TCCTGCGTTA 2160

TCCCCTGATT CTGTGGATAA CCGTATTACC GCCTTTGAGT GAGCTGATAC CGCTCGCCGC 2220

AGCCGAACGA CCGAGCGCAG CGAGTCAGTG AGCGAGGAAG CGGAAGAGCG CCTGATGCGG 2280

TATTTTCTCC TTACGCATCT GTGCGGTATT TCACACCGCA TATATGGTGC ACTCTCAGTA 2340

CAATCTGCTC TGATGCCGCA TAGTTAAGCC AGTATACACT CCGCTATCGC TACGTGACTG 2400

GGTCATGGCT GCGCCCCGAC ACCCGCCAAC ACCCGCTGAC GCGCCCTGAC GGGCTTGTCT 2460

GCTCCCGGCA TCCGCTTACA GACAAGCTGT GACCGTCTCC GGGAGCTGCA TGTGTCAGAG 2520

GTTTTCACCG TCATCACCGA AACGCGCGAG GCAGCTGCGG TAAAGCTCAT CAGCGTGGTC 2580

GTGAAGCGAT TCACAGATGT CTGCCTGTTC ATCCGCGTCC AGCTCGTTGA GTTTCTCCAG 2640

AAGCGTTAAT GTCTGGCTTC TGATAAAGCG GGCCATGTTA AGGGCGGTTT TTTCCTGTTT 2700

GGTCACTGAT GCCTCCGTGT AAGGGGGATT TCTGTTCATG GGGGTAATGA TACCGATGAA 2760

ACGAGAGAGG ATGCTCACGA TACGGGTTAC TGATGATGAA CATGCCCGGT TACTGGAACG 2820

TTGTGAGGGT AAACAACTGG CGGTATGGAT GCGGCGGGAC CAGAGAAAAA TCACTCAGGG 2880

TCAATGCCAG CGCTTCGTTA ATACAGATGT AGGTGTTCCA CAGGGTAGCC AGCAGCATCC 2940

TGCGATGCAG ATCCGGAACA TAATGGTGCA GGGCGCTGAC TTCCGCGTTT CCAGACTTTA 3000

CGAAACACGG AAACCGAAGA CCATTCATGT TGTTGCTCAG GTCGCAGACG TTTTGCAGCA 3060

GCAGTCGCTT CACGTTCGCT CGCGTATCGG TGATTCATTC TGCTAACCAG TAAGGCAACC 3120

CCGCCAGCCT AGCCGGGTCC TCAACGACAG GAGCACGATC ATGCGCACCC GTGGGGCCGC 3180

CATGCCGGCG ATAATGGCCT GCTTCTCGCC GAAACGTTTG GTGGCGGGAC CAGTGACGAA 3240

GGCTTGAGCG AGGGCGTGCA AGATTCCGAA TACCGCAAGC GACAGGCCGA TCATCGTCGC 3300

GCTCCAGCGA AAGCGGTCCT CGCCGAAAAT GACCCAGAGC GCTGCCGGCA CCTGTCCTAC 3360

GAGTTGCATG ATAAAGAAGA CAGTCATAAG TGCGGCGACG ATAGTCATGC CCCGCGCCCA 3420

CCGGAAGGAG CTGACTGGGT TGAAGGCTCT CAAGGGCATC GGTCGAGATC CCGGTGCCTA 3480

ATGAGTGAGC TAACTTACAT TAATTGCGTT GCGCTCACTG CCCGCTTTCC AGTCGGGAAA 3540

CCTGTCGTGC CAGCTGCATT AATGAATCGG CCAACGCGCG GGGAGAGGCG GTTTGCGTAT 3600

TGGGCGCCAG GGTGGTTTTT CTTTTCACCA GTGAGACGGG CAACAGCTGA TTGCCCTTCA 3660

CCGCCTGGCC CTGAGAGAGT TGCAGCAAGC GGTCCACGCT GGTTTGCCCC AGCAGGCGAA 3720

AATCCTGTTT GATGGTGGTT AACGGCGGGA TATAACATGA GCTGTCTTCG GTATCGTCGT 3780

ATCCCACTAC CGAGATATCC GCACCAACGC GCAGCCCGGA CTCGGTAATG GCGCGCATTG 3840

CGCCCAGCGC CATCTGATCG TTGGCAACCA GCATCGCAGT GGGAACGATG CCCTCATTCA 3900

GCATTTGCAT GGTTTGTTGA AAACCGGACA TGGCACTCCA GTCGCCTTCC CGTTCCGCTA 3960

TCGGCTGAAT TTGATTGCGA GTGAGATATT TATGCCAGCC AGCCAGACGC AGACGCGCCG 4020

AGACAGAACT TAATGGGCCC GCTAACAGCG CGATTTGCTG GTGACCCAAT GCGACCAGAT 4080

GCTCCACGCC CAGTCGCGTA CCGTCTTCAT GGGAGAAAAT AATACTGTTG ATGGGTGTCT 4140

GGTCAGAGAC ATCAAGAAAT AACGCCGGAA CATTAGTGCA GGCAGCTTCC ACAGCAATGG 4200

CATCCTGGTC ATCCAGCGGA TAGTTAATGA TCAGCCCACT GACGCGTTGC GCGAGAAGAT 4260 

TGTGCACCGC CGCTTTACAG GCTTCGACGC CGCTTCGTTC TACCATCGAC ACCACCACGC 4320 

TGGCACCCAG TTGATCGGCG CGAGATTTAA TCGCCGCGAC AATTTGCGAC GGCGCGTGCA 4380 

GGGCCAGACT GGAGGTGGCA ACGCCAATCA GCAACGACTG TTTGCCCGCC AGTTGTTGTG 4440 

CCACGCGGTT GGGAATGTAA TTCAGCTCCG CCATCGCCGC TTCCACTTTT TCCCGCGTTT 4500 

TCGCAGAAAC GTGGCTGGCC TGGTTCACCA CGCGGGAAAC GGTCTGATAA GAGACACCGG 4560 

CATACTCTGC GACATCGTAT AACGTTACTG GTTTCACATT CACCACCCTG AATTGACTCT 4620 

CTTCCGGGCG CTATCATGCC ATACCGCGAA AGGTTTTGCG CCATTCGATG GTGTCCGGGA 4680 

TCTCGACGCT CTCCCTTATG CGACTCCTGC ATTAGGAAGC AGCCCAGTAG TAGGTTGAGG 4740 

CCGTTGAGCA CCGCCGCCGC AAGGAATGGT GCATGCAAGG AGATGGCGCC CAACAGTCCC 4800 

CCGGCCACGG GGCCTGCCAC CATACCCACG CCGAAACAAG CGCTCATGAG CCCGAAGTGG 4860 

CGAGCCCGAT CTTCCCCATC GGTGATGTCG GCGATATAGG CGCCAGCAAC CGCACCTGTG 4920 

GCGCCGGTGA TGCCGGCCAC GATGCGTCCG GCGTAGAGGA TCGAGATCTC GATCCCGCGA 4980 

AATTAATACG ACTCACTATA GGGGAATTGT GAGCGGATAA CAATTCCCCT CTAGAAATAA 5040 

TTTTGTTTAA CTTTAAGAAG GAGATATACA TATGGGCCAT CATCATCATC ATCACGTGAT 5100 

CGACATCATC GGGACCAGCC CCACATCCTG GGAACAGGCG GCGGCGGAGG CGGTCCAGCG 5160 

GGCGCGGGAT AGCGTCGATG ACATCCGCGT CGCTCGGGTC ATTGAGCAGG ACATGGCCGT 5220 

GGACAGCGCC GGCAAGATCA CCTACCGCAT CAAGCTCGAA GTGTCGTTCA AGATGAGGCC 5280 

GGCGCAACCG AGGGGCTCGA AACCACCGAG CGGTTCGCCT GAAACGGGCG CCGGCGCCGG 5340 

TACTGTCGCG ACTACCCCCG CGTCGTCGCC GGTGACGTTG GCGGAGACCG GTAGCACGCT 5400 

GCTCTACCCG CTGTTCAACC TGTGGGGTCC GGCCTTTCAC GAGAGGTATC CGAACGTCAC 5460 

GATCACCGCT CAGGGCACCG GTTCTGGTGC CGGGATCGCG CAGGCCGCCG CCGGGACGGT 5520 

CAACATTGGG GCCTCCGACG CCTATCTGTC GGAAGGTGAT ATGGCCGCGC ACAAGGGGCT 5580 
GATGAACATC GCGCTAGCCA TCTCCGCTCA GCAGGTCAAC TACAACCTGC CCGGAGTGAG 5640

CGAGCACCTC AAGCTGAACG GAAAAGTCCT GGCGGCCATG TACCAGGGCA CCATCAAAAC 5700

CTGGGACGAC CCGCAGATCG CTGCGCTCAA CCCCGGCGTG AACCTGCCCG GCACCGCGGT 5760

AGTTCCGCTG CACCGCTCCG ACGGGTCCGG TGACACCTTC TTGTTCACCC AGTACCTGTC 5820

CAAGCAAGAT CCCGAGGGCT GGGGCAAGTC GCCCGGCTTC GGCACCACCG TCGACTTCCC 5880

GGCGGTGCCG GGTGCGCTGG GTGAGAACGG CAACGGCGGC ATGGTGACCG GTTGCGCCGA 5940

GACACCGGGC TGCGTGGCCT ATATCGGCAT CAGCTTCCTC GACCAGGCCA GTCAACGGGG 6000

ACTCGGCGAG GCCCAACTAG GCAATAGCTC TGGCAATTTC TTGTTGCCCG ACGCGCAAAG 6060

CATTCAGGCC GCGGCGGCTG GCTTCGCATC GAAAACCCCG GCGAACCAGG CGATTTCGAT 6120

GATCGACGGG CCCGCCCCGG ACGGCTACCC GATCATCAAC TACGAGTACG CCATCGTCAA 6180

CAACCGGCAA AAGGACGCCG CCACCGCGCA GACCTTGCAG GCATTTCTGC ACTGGGCGAT 6240

CACCGACGGC AACAAGGCCT CGTTCCTCGA CCAGGTTCAT TTCCAGCCGC TGCCGCCCGC 6300

GGTGGTGAAG TTGTCTGACG CGTTGATCGC GACGATTTCC AGCGCTGAGA TGAAGACCGA 6360

TGCCGCTACC CTCGCGCAGG AGGCAGGTAA TTTCGAGCGG ATCTCCGGCG ACCTGAAAAC 6420

CCAGATCGAC CAGGTGGAGT CGACGGCAGG TTCGTTGCAG GGCCAGTGGC GCGGCGCGGC 6480

GGGGACGGCC GCCCAGGCCG CGGTGGTGCG CTTCCAAGAA GCAGCCAATA AGCAGAAGCA 6540

GGAACTCGAC GAGATCTCGA CGAATATTCG TCAGGCCGGC GTCCAATACT CGAGGGCCGA 6600

CGAGGAGCAG CAGCAGGCGC TGTCCTCGCA AATGGGCTTT GTGCCCACAA CGGCCGCCTC 6660

GCCGCCGTCG ACCGCTGCAG CGCCACCCGC ACCGGCGACA CCTGTTGCCC CCCCACCACC 6720

GGCCGCCGCC AACACGCCGA ATGCCCAGCC GGGCGATCCC AACGCAGCAC CTCGGCCGGC 6780

CGACCCGAAC GCACCGCCGC CACCTGTCAT TGCCCCAAAC GCACCCCAAC CTGTCCGGAT 6840

CGACAACCCG GTTGGAGGAT TCAGCTTCGC GCTGCCTGCT GGCTGGGTGG AGTCTGACGC 6900

CGCCCACTTC GACTACGGTT CAGCACTCCT CAGCAAAACC ACCGGGGACC CGCCATTTCC 6960

CGGACAGCCG CCGCCGGTGG CCAATGACAC CCGTATCGTG CTCGGCCGGC TAGACCAAAA 7020

GCTTTACGCC AGCGCCGAAG CCACCGACTC CAAGGCCGCG GCCCGGTTGG GCTCGGACAT 7080

GGGTGAGTTC TATATGCCCT ACCCGGGCAC CCGGATCAAC CAGGAAACCG TCTCGCTTGA 7140

CGCCAACGGG GTGTCTGGAA GCGCGTCGTA TTACGAAGTC AAGTTCAGCG ATCCGAGTAA 7200

GCCGAACGGC CAGATCTGGA CGGGCGTAAT CGGCTCGCCC GCGGCGAACG CACCGGACGC 7260

CGGGCCCCCT CAGCGCTGGT TTGTGGTATG GCTCGGGACC GCCAACAACC CGGTGGACAA 7320

GGGCGCGGCC AAGGCGCTGG CCGAATCGAT CCGGCCTTTG GTCGCCCCGC CGCCGGCGCC 7380

GGCACCGGCT CCTGCAGAGC CCGCTCCGGC GCCGGCGCCG GCCGGGGAAG TCGCTCCTAC 7440

CCCGACGACA CCGACACCGC AGCGGACCTT ACCGGCCTGA GAATTCTGCA GATATCCATC 7500

ACACTGGCGG CCGCTCGAGC ACCACCACCA CCACCACTGA GATCCGGCTG CTAACAAAGC 7560

CCGAAAGGAA GCTGAGTTGG CTGCTGCCAC CGCTGAGCAA TAACTAGCAT AACCCCTTGG 7620

GGCCTCTAAA CGGGTCTTGA GGGGTTTTTT GCTGAAAGGA GGAACTATAT CCGGAT 7676

Met Gly His His His His His His Val Ile Asp Ile Ile Gly Thr Ser
1 5 10 15

Pro Thr Ser Trp Glu Gln Ala Ala Ala Glu Ala Val Gln Arg Ala Arg
20 25 30

Asp Ser Val Asp Asp Ile Arg Val Ala Arg Val Ile Glu Gln Asp Met
35 40 45

Ala Val Asp Ser Ala Gly Lys Ile Thr Tyr Arg Ile Lys Leu Glu Val
50 55 60

Ser Phe Lys Met Arg Pro Ala Gln Pro Arg Gly Ser Lys Pro Pro Ser
65 70 75 80

Gly Ser Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro
85 90 95

Ala Ser Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr
100 105 110

Pro Leu Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn
115 120 125

Val Thr Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln
130 135 140

Ala Ala Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser
145 150 155 160

Glu Gly Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala
165 170 175

Ile Ser Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His
180 185 190

Leu Lys Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile
195 200 205

Lys Thr Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn
210 215 220

Leu Pro Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly
225 230 235 240

Asp Thr Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly
245 250 255

Trp Gly Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val
260 265 270

Pro Gly Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys
275 280 285

Ala Glu Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp
290 295 300

Gln Ala Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser
305 310 315 320

Gly Asn Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala
325 330 335

Gly Phe Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp
340 345 350

Gly Pro Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile
355 360 365

Val Asn Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala
370 375 380

Phe Leu His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp
385 390 395 400

Gln Val His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp
405 410 415

Ala Leu Ile Ala Thr Ile Ser Ser Ala Glu Met Lys Thr Asp Ala Ala
420 425 430

Thr Leu Ala Gln Glu Ala Gly Asn Phe Glu Arg Ile Ser Gly Asp Leu
435 440 445

Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala Gly Ser Leu Gln Gly
450 455 460

Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln Ala Ala Val Val Arg
465 470 475 480

Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu Leu Asp Glu Ile Ser
485 490 495

Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg Ala Asp Glu Glu
500 505 510

Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe Val Pro Thr Thr Ala
515 520 525

Ala Ser Pro Pro Ser Thr Ala Ala Ala Pro Pro Ala Pro Ala Thr Pro
530 535 540

Val Ala Pro Pro Pro Pro Ala Ala Ala Asn Thr Pro Asn Ala Gln Pro
545 550 555 560

Gly Asp Pro Asn Ala Ala Pro Pro Pro Ala Asp Pro Asn Ala Pro Pro
565 570 575

Pro Pro Val Ile Ala Pro Asn Ala Pro Gln Pro Val Arg Ile Asp Asn
580 585 590

Pro Val Gly Gly Phe Ser Phe Ala Leu Pro Ala Gly Trp Val Glu Ser
595 600 605

Asp Ala Ala His Phe Asp Tyr Gly Ser Ala Leu Leu Ser Lys Thr Thr
610 615 620

Gly Asp Pro Pro Phe Pro Gly Gln Pro Pro Pro Val Ala Asn Asp Thr
625 630 635 640

Arg Ile Val Leu Gly Arg Leu Asp Gln Lys Leu Tyr Ala Ser Ala Glu
645 650 655

Ala Thr Asp Ser Lys Ala Ala Ala Arg Leu Gly Ser Asp Met Gly Glu
660 665 670

Phe Tyr Met Pro Tyr Pro Gly Thr Arg Ile Asn Gln Glu Thr Val Ser
675 680 685

Leu Asp Ala Asn Gly Val Ser Gly Ser Ala Ser Tyr Tyr Glu Val Lys
690 695 700

Phe Ser Asp Pro Ser Lys Pro Asn Gly Gln Ile Trp Thr Gly Val Ile
705 710 715 720

Gly Ser Pro Ala Ala Asn Ala Pro Asp Ala Gly Pro Pro Gln Arg Trp
725 730 735

Phe Val Val Trp Leu Gly Thr Ala Asn Asn Pro Val Asp Lys Gly Ala
740 745 750

Ala Lys Ala Leu Ala Glu Ser Ile Arg Pro Leu Val Ala Pro Pro Pro
755 760 765

Ala Pro Ala Pro Ala Pro Ala Glu Pro Ala Pro Ala Pro Ala Pro Ala
770 775 780

Gly Glu Val Ala Pro Thr Pro Thr Thr Pro Thr Pro Gln Arg Thr Leu
785 790 795 800

Pro Ala
CAtATG CATCACCAKACCATCACATCGCCACCACCCTTCCCGTTCAGCGCCACCCGCGGTCCCTCTTCCC^^

CTATACGTAGTGCTAGTGCTAGTGTACCCGTGCTGGCAAGGGCAAGTCGCCGTGGGCGCCAGGCACAAGGGGCTCAAAAGACTCGACAACCGCCCGAACGGCAGIAAGCCCCCTCACCCC 



Ymmm Mel / HIS TAG i ll i^— — 6fd 1 4 — — 

HMHHHHHHMATTLPVORHPRSLFPEfSELFAAFPSFAGLll 

CCCA CCTTCGACACCCGGTTGATGCGCCTGCAAGACGAGATGAAAGAGGCCCGCTACGAGGTACGCGCGGACCTTCCCGCGGTCCACCCCGAC 

CCGTCCAAGCTGTGGCCCAACTACGCCGACCTTCTGCTCTACTTTCTCCCCGCGATCCTCCATGCGCGCCTCGAAGCGCCCCAGCTGCGCCTGTTCCTGCAGCTGTAATACCACCCCCrA 



■6rd14" 



PTFDTRLMRLEDEttKEGRYEVRAELPGVOPOKOVOIMVRO 
GGTCAGCTGACCATCAACGCCGAGCGC^CCGACCAGAAGGACTTCGACGGTCGCTCGGAATTCGCGTACGGTTCCTTCGTTCGCACGCT^ 

CCAGTCGACTGGTAGTTCCGGCTCCCGTGCCTCGTCTTCCTGAAGCTGCCAGCGAGCCTTAAGCGCATGCCAACGAAGCAAGCGTGCCACACCCACCGCCATCCACCACTGCTCCTGCTG 



■Erd14« 



COLT JKAERTEQKDFOGRSEFAYGSFVRTVSLPVGAOEOD 
ATTAAGGCCACCTACGACAAGGGCATTCTTACTGTGTCGGTGGCCCTTTCGGAACGGAAGCCAACCGAAA^ 

TAATTCCCGrGGATGCTGTTCCCGTAAGAATGACACAGCCACCGCCAAAGCCTTCCCTTCGGTTGGCTTTTCGTGTAAGTCTAGGCCAGGTGGTTGTTCGAACTAGGGCACCTGCGCCAG 



>Erd 14 ■ 



IXATYOKCILTVSVAVSEGKPTEKHIQIRSTNKLOP 
ATTAACACCACCTGCAATTACGGCCAGCTAGTAGCTGCGCTCAACCCGACGGATCCGGCGGCTGCCGCACACTTCAACGCCTCACC GGTGC CGCAGTCCTATTTGCCCAATTTCCTCGCC 

I I I I I ( [ ) I ■ t ■ 1 I ■ ' ' ' " '~* 4 ■ I I ■ I ■ ■ ' ■ I I ■ I I I I'll II ■ I I 5Q0 

TAATTGTGGTGGACGTTAATGCCCGTCCATCATCGACGCCAGTTGCGCTGCCTAGGCCCCCGACGGCGTGTCAACTTCCGGACTGCCCACCGCGTCAGCATAAACGCGTTAAAGGACCGG 



■OPV" 



INTTCNYGOVVAALNATDPGAAAOFNASPVAQSVtRNFLA 

- ilCACCGCCACCTCACCGCGCTGCCATGCCCGCGCAATTCCAAGCTGTGCCGGGGGCGGCACAGTACATCGGCCrTGTCG^^ 
CCTGGCGGTGCAGTCGCGCGACGGTACCCCCGCGTTAACCTTCGACACCCCCCCCGCCGTCTCATGTAGCCGGAACAGCTCAGCCAACCCCCGA6CACGTTGTTGATACTCGAGTACTGC 



. OPV l l Sad § MTI ■ 



APPPORAAMAAQLOAVPGAAOYICLVESVAGSCNNYECMT 
ATTAATTACCACTTCGGGGACGTCGACGCTCATGGCGCCATGATCCGCCCTCAGGCGCCGTCGCTTGAGGCGGAGCATCAGGCCA^ 

rAATTAATCGTCAAGCCCCTGCAGCTGCGAGTACCGCGGTACIACGCGCGACTCCGCCGCAGCGAACTCCGCCTCGTAGTCCGCTACCAACCACTACACAACCGGCCCCCACTGAAAACC 



"MTI« 



INYQFGOVOAHGAMIRAQAASLEAEHQAIVROVLAAGOFW 
GGCCGCGCCGGTTCCCTGCCTTGCCAGGACTTCATTACCCAGTTCGGCCGTAACTTCCACGTGATCTACCAGCACCCCAACGCCCACG 

CCGCCGCGGCCAAGCCACCGAACCGTCCTCAAGTAATGGGTCAACCCCCCATTCAAGGTCCACTAGATGCrCGTCCCGTTCCGGGTGCCCGTCTTCCACGTCCGACGGCCGTTGTTGTAC 



CCAGSVACQEF I TQLGRNFOVIYEOANAHGOKVOAAGNNM 

GCGCAAACCGACACCGCCGTCGCCTCCAGCTGGGCCACTAGTATGAGCCTTTTGGATGCTCATATCCCACAGTTGGTCGCCTCCC AGTCCGCGTTTGCCCCCAAGGCGGGGCTGAT^^ 

I I 1 1 1 1 I I I I I 1 1 1 1 III ■ ' I ' ■ ' I ' 1 • t ■ ■ [■-■■'■ . ^ ..... i - ~- 1 1 I ■ ■ ■!■ t ■ ■ ■III I ■ ■ « iMit ■ ■ ■ ■ I ■ ■ I iQQo 

CCCGTTTCCCTGTCGCGGCAGCCCAGGTCGACCCGGTGATCATACTCGGAAAACCTACGAGTATAGGGTGTCAACCACCCGAGCGTCAGCCGCAAACGGCGGTTCCGCCCCGACTACGCC 



MTI I I fSpe ni MSL" 



AOTOSAVGSSWATSnSLLOAHIPOLVASQSAFAAKAGLnR 
CACACGATCGGTCAG GCCCAGCACGCCGCGATGTCGGCTCACGCGTTTCAC CAGGGGGAGTCGTCGGCGGCGTTTCAGGCCGCCCATGCCCGGTTTGTGCCGGCGCCCGCCAAAG^ 
CTCTGCTACCCAGTCCCGCTCGTCCCCCCCTACACCCGACTCCGCAAACrGGTCCCCCTCAGCAGCCGCCGCAAAGTCCCGCGGCTACGGCCCAAACACCGCCGCCGGCGGTTTCAGTTG 



1200 



MSL — 

HTIGOAEQAAMSAOAFHQCESSAAFOAAHARFVAAAAKVN 

ACCTTCTTCCATCTCGCCCAGGCCAATCTGGCTCACCCCCCCGGTACCTATCTCCCCGCCGATGCTGCGGCCCCGTCGACCTAT^^ ^^^^ 
TCGAACAACCTACAGCGCGTCCGCTTACACCCACTCCCCCCCCCArCGATACACCCCCCCCTACCACGCCCCCGCAGCTGGATATGCCCCAAGCTATAGTACCTAAACCCCGAAAATGGA 



— "^i— ■ MSL — — 

AQANLGEAAGfYVAAOAAAASTYTC 
LCCCAACTJAATTCAACCC CAATCTATTCCCCTCCCCCGCCCGAGTCCATGCTACCCCCCCCCGCCGCCrGGCACCCTCTCCCCCCCCAGTTCACTTCCCCCCC^^
CGCCTTCACTTAACTTCGCCTTACATAACGCCACCCCCCCCCCTCAGCTACCATCGCCCCCCCCGGCGCACCCTCCCACACCGGCGCCTCAACTGAACCCCGCCCCACAGCA^ACCIACC 



•mTCC2" 



PCVNSSRMrSGPGPESnLAAAAAWOCVAAELTSAAVSYCS 
CTCGTCTCGAC GCTGATCGTTGAGCCGTCGATCGCGCCGGCGGCGCCCCCGATGCCCGCCCCCCCAACGCCCTATGTCGCGTGGCTGGCCGCCACGCCCC^^^ 
CACCACAGCTGCGACTACCAACTCGGCACCTACCCCCGCCGCCGCCCCCGCTACCGCCGGCCCCGTTGCGGCATACACCCCACCCACCCGCCCTGCW^^ 



■mTCC2" 



VVSTLlvePWK CPAAAAMAAAATPYVGWLAATAALAKETA 
ACACACGCGACCGCAGCGGCCCAACCCTTTGGCACGCCCTTCGCGATCACGGTGCCACCATCCCTCCTCGCGCCCAACCCCACCCGGTTGATGTCCCTCGTCCCGGCGAA^ 
T^TCTCCCCTCCCGTCCCCGCC^ 



• mTCC2* 



TQARAAAEAFGTAFAMTVPPSLVAANRSRLnSLVAANlLG 
CAAAACACTGCGCC GATCGCGCCTACCCAGCCCGACTATGCCGAAATCTGGGCCCAAGACGCTCCCGTGyGTACAGCTATCAGCGGCCATCTGCCCCCG^ 

CTTTTGTCACGCCCCTAGCGCCGATCCGTCCCGCTCATACGGCTTTACACCCGGCTTCTCCGACGGCACIACATGTCGATACTCCCCCGTAGACGCCCCCGCAGCCCCAACCGCCGCAAG 



■ mTCC2" 



QNSAA I AATOAEYAEMWAQDAAVMYSYECASAAASALPPF 
ACTCCACCCCTGCAAGGCACCCCCCCCCCCCGGCCCGCGGCCGCACCCCCCCCGACCCAACCCCCCGGTGCCCCCGCCGTTGCGGATCCACAGGC^ ,^20 
TGACCTCCGCACGTTCCGTGGCCGGGCCCGCCCGGGCGCCGGCCTCGCCGCCGCTGCGTTCGCCCGCCACGCCCCCCGCAACCCCTACGTCTCCCCTGTCACCCGCTCCACGCCCGCCCC 



■mTCC2" 



TPPVQGTGPAGPAAAAAATQAAGACAVAOAOATLAOLPPG 

ATCCTGAGCGACATTCTGTCCGCATTGGCCCCCAACCCTCATCCGCTGACATCGGCACTCTTGGCGArCGCCTCGACCCTCAACCCGCAAGTCGGATCC^^ 2040 
TACCACTCCCTGTAAGACAGCCCTAACCGGCCGTTGCGACTAGGCCACTGTACCCCTGACAACCCCTACCGCAGCTGCGACTTCCGCCTTCAGCCTAGCCGAGTCCGCTATCACTAGGGG 



■mTCC2« 



(LSOILSALAANAOPLTSGLLGIAST LNPO VGSAQPIVIP 

A CCCCCATACGCCAATTGGACGTCATCCCGCTCTACATTGCArCCATCCCGACCCGCACCATrCCCCTCCCCATCACCAACACCCCCACACCCTCCCACA^ 2160 
TGCCGCTATCCCCTTAACCTGCACTACCCCGAGATGTAACGTAGGTACCGCTGGCCGTCGTAACCCCACCCCTACTGCTTCTCCCCGTCTGCGACCGTGTAGCCGGATATGCCCTTGCCC 



■mTCC2" 



TPICELOVIALYIASIATCSIALAITNTARPWHIGLYGNA 

GGCGGGCTCGCACCGACGCACGCCCATCCACTGACTTCGGCGACCGACGAGCCCCACCCGCACTGCCGCCCCTTCGGGCCCCCGCCCCCGC ^280 
CCGCCCCACCCTGGCTGCGTCCCGGTAGCTGACTCAACCCGCTCGCTGCTCGGCCTCGCCGTGACCCCGCGCAACCCCCCCCCCCGCGGCCACACCCGCCCGCACCCCCTGCGTCGT^^^ 



•mTCC2" 



GCLGPTQGHPLSSATOEPEPHWGPFGGAAPVSACVGHAAL 
CTCGCAGCGTTGTCGGTGCCCCACAGCrCCACCACGGCCCCCCCGGAGATCCAGCTCCCCGTTCAGCCAACACCCACCrTCACCTCCACCCCCGGCCCCCACCCGACGGC^ 
CAGCCTCCCAACAGCCACGGCGTGTCCACCTCGTGCCCCCCCGGCCTCTAGCTCCACCGCCAAGTCCGTTGTGCCTCCAACTCGAGCTCGCCCCCCCGGCTGCCCTCCCGGGATTTGCCC 



mTCC2



VCALSVPHSWTTAAPEtOLAVOATPT rSSSAGAOPTALNG 

ATCCCCCCACCCCTCCTCAGCCCCATCCCTTTGGCGACCCTGGCCGCACGCGGCACGACGGCCCCTGGCCCCACCCCTACCCGCACCAGCACTGACCGCCAAC ^^^^ 
TACCCCCCTCCGGACCACTCCCCCTACCCAAACCGCTCGGACCCGCGTGCCCCGTGCTCCCCCCCACCCCCCTCCCCATCGCCCTGCTCGTCACTGCCGCTTCTCCTCCCCCCGT 



mTCC2



«PACLLSCHALASLAARCTTCCGGTR SCTS TOGOEOGRlCP 

CCGCTAGTTCTGATTAGAGAGCAGCCCCCCCCCCC AA ACCCCCCCCGGrAAGATATC 
CGCCAICAACACTAATCTCTCGTCCCCCCCCCCCCTTTCCCCCGCCCCArTCTATAG 



2577 



■—>——— mTCC2 I I RV"] 

PVVVIREQPPPCNPPR 01 
CATATGCATCACCATCACCATCACATGGCCACCACCCTTCCCGTTCAGCGCCACCCGCGGTCCCTCTTCCCCGAGTTTTCTGAGCTGTTCGCGGCCTTCC

100

GTATACGTAGTGGTAGTGGTAGTGTACCGGTGGTGGGAAGGGCAAGTCGCGGTGGGCGCCAGGGAGAAGGGGCTCAAAAGACTCGACAAGCGCCGGAAGG

HMHHHHHHMATTLPVQRHPRSLFPEFSELFAAF

CGTCATTCGCCGGACTCCGGCCCACCTTCGACACCCGGTTGATGCGGCTGGAAGACGAGATGAAAGAGGGGCGCTACGAGGTACGCGCGGAGCTTCCCGG 

200

GCAGTAAGCGGCCTGAGGCCGGGTGGAAGCTGTGGGCCAACTACGCCGACCTTCTGCTCTACTTTCTCCCCGCGATGCTCCATGCGCGCCTCGAAGGGCC

PSFAGLRPTFDTRLMRLEDEMKEGRYEVRAELPG

GGTCGACCCCGACAAGGACGTCGACATTATGGTCCGCGATGGTCAGCTGACCATCAAGGCCGAGCGCACCGAGCAGAAGGACTTCGACGGTCGCTCGGAA

300

CCAGCTGGGGCTGTTCCTGCAGCTGTAATACCAGGCGCTACCAGTCGACTGGTAGTTCCGGCTCGCGTGGCTCGTCTTCCTGAAGCTGCCAGCGAGCCTT 

VDPDKDVDIMVRDGQLTIKAERTEQKDFDGRSE

TTCGCGTACGGTTCCTTCGTTCGCACGGTGTCGCTGCCGGTAGGTGCTGACGAGGACGACATTAAGGCCACCTACGACAAGGGCATTCTTACTGTGTCGG

400

AAGCGCATGCCAAGGAAGCAAGCGTGCCACAGCGACGGCCATCCACGACTGCTCCTGCTGTAATTCCGGTGGATGCTGTTCCCGTAAGAATGACACAGCC 

FAYGSFVRTVSLPVGADEDDIKATYDKGILTVS



TGGCGGTTTCGGAAGGGAAGCCAACCGAAAAGCACATTCAGATCCGGTCCACCAACAAGCTTGATCCCGTGGACGCGGTCATTAACACCACCTGCAATTA 

500

ACCGCCAAAGCCTTCCCTTCGGTTGGCTTTTCGTGTAAGTCTAGGCCAGGTGGTTGTTCGAACTAGGGCACCTGCGCCAGTAATTGTGGTGGACGTTAAT 

VAVSEGKPTEKHIQIRSTNKLDPVDAVINTTCNY

CGGGCAGGTAGTAGCTGCGCTCAACGCGACGGATCCGGGGGCTGCCGCACAGTTCAACGCCTCACCGGTGGCGCAGTCCTATTTGCGCAATTTCCTCGCC 

600

GCCCGTCCATCATCGACGCGAGTTGCGCTGCCTAGGCCCCCGACGGCGTGTCAAGTTGCGGAGTGGCCACCGCGTCAGGATAAACGCGTTAAAGGAGCGG

GQVVAALNATDPGAAAQFNASPVAQSYLRNFLA

GCACCGCCACCTCAGCGCGCTGCCATGGCCGCGCAATTGCAAGCTGTGCCGGGGGCGGCACAGTACATCGGCCTTGTCGAGTCGGTTGCCGGCTCCTGCA 

700

CGTGGCGGTGGAGTCGCGCGACGGTACCGGCGCGTTAACGTTCGACACGGCCCCCGCCGTGTCATGTAGCCGGAACAGCTCAGCCAACGGCCGAGGACGT

APPPQRAAMAAQLQAVPGAAQYIGLVESVAGSC

ACAACTATGAGCTCATGACGATTAATTACCAGTTCGGGGACGTCGACGCTCATGGCGCCATGATCCGCGCTCAGGCGGCGTCGCTTGAGGCGGAGCATCA 

800

TGTTGATACTCGAGTACTGCTAATTAATGGTCAAGCCCCTGCAGCTGCGAGTACCGCGGTACTAGGCGCGAGTCCGCCGCAGCGAACTCCGCCTCGTAGT 

NNYELMTINYQFGDVDAHGAMIRAQAASLEAEHQ

GGCCATCGTTCGTGATGTGTTGGCCGCGGGTGACTTTTGGGGCGGCGCCGGTTCGGTGGCTTGCCAGGAGTTCATTACCCAGTTGGGCCGTAACTTCCAG 

900

CCGGTAGCAAGCACTACACAACCGGCGCCCACTGAAAACCCCGCCGCGGCCAAGCCACCGAACGGTCCTCAAGTAATGGGTCAACCCGGCATTGAAGGTC 

AIVRDVLAAGDFWGGAGSVACQEFITQLGRNFQ
GTGATCTACGAGCAGGCCAACGCCCACGGGCAGAAGGTGCAGGCTGCCGGCAACAACATGGCGCAAACCGACAGCGCCGTCGGCTCCAGCTGGGCCACTA

1000

CACTAGATGCTCGTCCGGTTGCGGGTGCCCGTCTTCCACGTCCGACGGCCGTTGTTGTACCGCGTTTGGCTGTCGCGGCAGCCGAGGTCGACCCGGTGAT

VIYEQANAHGQKVQAAGNNMAQTDSAVGSSWAT



GTATGAGCCTTTTGGATGCTCATATCCCACAGTTGGTGGCCTCCCAGTCGGCGTTTGCCGCCAAGGCGGGGCTGATGCGGCACACGATCGGTCAGGCCGA

1100

CATACTCGGAAAACCTACGAGTATAGGGTGTCAACCACCGGAGGGTCAGCCGCAAACGGCGGTTCCGCCCCGACTACGCCGTGTGCTAGCCAGTCCGGCT

SMSLLDAHIPQLVASQSAFAAKAGLMRHTIGQAE

GCAGGCGGCGATGTCGGCTCAGGCGTTTCACCAGGGGGAGTCGTCGGCGGCGTTTCAGGCCGCCCATGCCCGGTTTGTGGCGGCGGCCGCCAAAGTCAAC

1200

CGTCCGCCGCTACAGCCGAGTCCGCAAAGTGGTCCCCCTCAGCAGCCGCCGCAAAGTCCGGCGGGTACGGGCCAAACACCGCCGCCGGCGGTTTCAGTTG

QAAMSAQAFHQGESSAAFQAAHARFVAAAAKVN

ACCTTGTTGGATGTCGCGCAGGCGAATCTGGGTGAGGCCGCCGGTACCTATGTGGCCGCCGATGCTGCGGCCGCGTCGACCTATACCGGGTTCGATATC

■ It t . I 1 1 ^ 1 \ *^ 1 ' ' ' ' ' ' ' ' ^ ^2* 

TGGAACAACCTACAGCGCGTCCGCTTAGACCCACTCCGGCGGCCATGGATACACCGGCGGCTACGACGCCGGCGCAGCTGGATATGGCCCAAGCTATAG 

TLLDVAQANLGEAAGTYVAADAAAASTYTGFDI
CATATCCATCACCATCACC ATCACCATCCCCTGGAC GCGGTCATTAACACCACCT G CAUTACGCGCAGCTAGTAGCTGCCCTCAACCCGACGCATCCCC
nTATACGTA^TGGTACTGGUCTGCTAGGGCACCTGCGCCAGTAATTGTGGTCGACGtlAATGC CCGTCCATCATCGACCCGACTTCCGCTGCCTACGCC 

Met /HIS TAG I |1 «— — * t n d 

V7h H H H H H 0 P V 0 a V I n t t c n Y G 0 V V a a l n a T 0 p 

.n.rT.rr.r.r.nTTrAACncCT CACCCGTGGCGCAGTCCTATTTGCGCAATTTC CTCGCCGCACCGCCACCTCA^^^^^^ 
rrrr.;r.nrGTGTCAAGTTGCGGAGTGGCCACCGCGTCAGGATAAACGCGTTAAAGGAGCGGCGTGG CGGTGGAGTCGCGCGACGGTACC^ 

nr ~ 



GAAAOFNASPVAQSYLRNFLA 



APPPORAAMAAQL 



or..orTnTnrrnr.r.r.GcacCACAGTAC AKGCCCTTGTCGAGTCGGTTCCCGGCTCCTGCAACAACTATGAC CTCATCACGATTAATTACCACTTCGG9 



300 



CCTTCGACACGGCCCCCCCCGTGICA 

•DPV 



T,.Tir.rfr.ftAACAGCTCAGCCAACGGCCGAGGACGTTGTTGATACTCCAGTACTCCTAATTAATGGTCAACCCC 



0 A V P G A A 0 V . G L V E S V A G S C N N Y E L « T . . Y Q F G 
GACGTCCACGCTCATG GCGCCATGATCCGCCCTCACGCGCCGTCGCTTGAGGCGGAGCATCAGCCCATCGTTCG TCATGTGTTGGCCGCGGGTGACTTTT ^ 
rrnrAGCTGCGAGTACCGCCGTACTAGGCGCGACTCCGCCGCAGCGAAcicCGCCTCCTAGTCCGGTACCAAGCACTACACAACCCGCCCCCACTGAAAA 



OVOAHGAM I R 



AQAASLEAEHQAIVRDVLAAGDF 
rTrATTACCCAGTTGGGCCGTAACTTCCAGQTGATCTACGAGCAGGCCAACGCCCACGGGCAGAAGGT 



GGGCCGGCGCCGGTTCGGT GGCTTGCC AGGAGTTCATTACCCAGTTGGGCCG I aal x i uuauu . um . ^ . ^u^^^.^^^w^ ^ 

CCCCGCCGCGCCCAAGCCACCGAACGGTCCTCAAGTAATGGGTCAACCCGCCATT GAAGGTCCACTAG^ 



WCCAGSVACOE 



FITQLCRNFQVIYEOANAHGQKV 
GCAGGCTGCCGGCAACAAC ATGGCCCAAACCGACAGCGCCGTCGGCTCCAGCTGGGCCACTAGrATGACCCTTT TGCATCCTCATATCCCACAGTTGGTG 

cgtccgacggccgt;gttg1accgcgtttcgctgtcgcggcacccgaggtcgacccggtgatc> ^actcggaaaacctac gagt 

Q A A G N N n A 0 T D S A V C S S W A T S « S L L D A H . P 0 L V 

UGCCGCACACGArCGGTCAGGCCGAGCAGGCGGCGATGTCGGCTCAGCCGTTTCACCAGGGGG 



ccgagggtcIgccgcaaacggcggttccgccccgactacgccgtgtgctagccact ccgcctcgtccgccgctacagccgagtiicgcaaagtggtccccc 

A S 0 S A F A a K A G L .1 R H T I G 0 A E 0 A A M S A Q A F H 0 G 
ACTCCTCGGCGGCGTTTCAGGCCCCCCATGCCCGGTTTGTGCCGGCGGCCGCCAAAGTCAACACCTTGT TCGATSTCGCGCAGCCGAATCTGGGTGAGGC 
TCAGCAGCCGCCGCAAAGTCCGCCGGGTACGGGCCAAACACCGCCGCCGGCGGTTTC AGTTGTGGAACAACCTACAGCGCGTCCGCTTACACCCACTCCG 

- — — — MCI — — — — — — — 

NILLOVAQAMLGEA 



-4- 



ESSAAFOAAHARFVAAAAKV 

CGCCGGTACCTATGTGGCCCCCGATGCTG CGG CCGCGTCCACCTATACCGGCTTCGATATCATGGATTTCG GGCTTTTACCTCCGCAAGTGAATTCAAGC 
GCCGCCATGGATACACCGGCGGCTAC6AC 



CGCCGGCGCAGCTGGATATGCCCCAAG CTATAGT ACCTAAAGCCCGAAAATGGAGCCCTTCACTTAAGTTCG 



.MSL— 4[nv J mTCC2 



AGTYVAAOAAA 



ASTYTGFOIMOFGLLPPEVNSS 



T 

Trrnr.Trrr.ftRncrnCAGTCGATGCTAGCCGCCGCGGCCGCCTGGGACGGTGTGGCCGCGCAGTTGACTTCCGCCGCGGTCTCGTATGGAT 



TCAGCTACGATCGGCGGCGCCGCCGCACCCTGCCACACCGGCGCCTCAACTGAAGGCCGCGCCAGAGCATACCTA 



CGAATCTAT TCCGGTCCGGGGCCGGAGTCGATG CTAGCCGCLbLbbLLULu i uuumuuo . » . w 

GCTTACATAAGGCCACGCCCCGGCC 

R M Y S G P G P E S « L A A A A A W 0 G V A A E L T S A A V S Y G 

CCGTCCTGTCGACGCTCATCGTTGAGCCGTGCATGCGGCCGCCCGCGGCCGCGArCGCGGCCGCGGCAACGCCGTATGTGGCGTGGa ^ 
GCCACCACAGCTGCCACTAGCAACTCGGcicCTACCCCGGCCGCCGCCGGCGCTA CCCCCGGCGCCGTTGCGGCATACACCCCACCGACCCG^ 

mTCC2i — — 

rtAAAAMAAAATPYVCWLAATA 
SVVSTLIVEPWMGPAAAAnAAAAir 
G CCGCTGCCGAACGAGACCCCCACACACGCGAGCGCAGCGCCGGAACCCTTTGGGACCCCCTTCCCCA TCACGCTGCCACCATCCCTCCTCCCCCCCAAC
CCGCGACCGCTTCCTCTCCCCGTGTGTCCGCTCCCGTCGCCGCCTTCCCAAACCCT GCCCCAAGCGCTACTGCCACCGTCCTAGCGAGCAGCCCCC^ 

A L A K E T A. T 0 A R A A A E A F G r A F A- n T V P P S L V A A N 

r.r..rrnr.TTr.ATfiTrr.CTGCTCGCCCCGAACATTCT GGGGCAAAACAGTGCCGCGATCGCGGCTACCCAGCCCGAGTATGCCGAAATGTGCGCCCAAG 

' ' ACCGCCGATGGGTCCGGCTCATACGCCTTTACACCCGGGTTC 



GCGTCGGCCAACTACAGCCACCAGCGCCGCTTGTAAGACCCCGTTTTGTCACCCCCCTAG 
mTCC2



R S R L « S U V A A N . L G 0 N S A A I A A T Q A E r A E M W A 0 
ACGCTGCCGTGATGTACAGCTATGAGCCGGCATCTGCGGCCGCGTCCGCGTTGCCGCCGTTCACTCCACCCCTGCAAGGCACCGGCCCGCCCCCCCCCGC 



GCAACGGCCGCAAGTGAGGTGGQCACGTTCCOTCGCCGCGCCCGCCCGGGCC 



T0CGAC0GCACTACATGTCGATACTCCCCCG7AGACGCCGGCGCAGCC 

0 A A y M y S r E 0 A S A A A S A L P P F T P P V 0 G T G P A G P A 

GCCCGCACCCGCGCCGACCCAAGCCCCCGGTGCCGGCCCCGTTGCGGATGCACAGGCGACACTGGCCCACCTCCCCCC GGCGATCCTCAGCGACATTCTG 
CCGCCGTCGGCGCCGCTGGGTTCCGCGGCCACCCCCGCGGCAACGCCTACGTGTCCGCTGTCACCGGGTCGACCGGGGCCCCTAGGACTCGCTGTAACAC 



AAAAATOAAGA 



mTCC2



GAVADAOATLAQLPPGILSD 



TCCCCATTGGCCCCCAACGCTGATCCCCTGACATCGGGACTGTTGGGGATCGCGTCGACCCTCAACCCCCAACTCCGATCC GCTCAGCCGATAGTGATCC 
ACGCGTAACCGGCCGTTGCCACTAGGCCACTGTAGCCCTCACAACCCCTAGCGCAGCT GGGAGTTGGGCGTTCAGCCTAGCCGAGTCGGCTATCACTACC 

MMM^B- iMB-a— -MaMM—— iMMWiM mTCC2"— 

S A L A A N A 0 P L T S G L L G I A S T L N P 0 V G S A Q P 1 V 1 

CCACCCCGATAGGGCAATTGCACGTGATCGCGCTCTACATTGCATCCATCGCGACCCGCACCATTCCCCTCCCCATCACCAAC ACGGCCACACCCTGGCA 

gctggcgctItccccttaacctgcactagcgcgagatctaacgtaggtagcgctggc cgtcgtaacgcgagcgctagtgcttgtgccggtctgggaccgt 

P t P I G E L D V . A L Y I A S I A T G S . A L A . T N T A R P W H ■ 
CATCCGCCTATAC GGGAACGCCCGCGGGCTGGCACCGACGCACGCCC ATCCACTGAGTTCGGCGACCGACGAGCCCGAGCC GCACTGGCGCCCCTTCGGC ^g^^ 

GTAGCCGCATATGCCCT 



:ttccggccccccgaccctggctgcgtcccggtaggtgactcaagccgctggctgctcggcctcggcgtgaccccggggaagccc 



mTCC2



, G L Y G N A G G L G P T 0 G H P L S S A T 0 E P E P H W G P F G 
CCCCCGGCGCCGCTGTCCCCCCGCGTCGCCCACGCAGCATTAGTCCGAGCGTTOTCGGTGCCGCACAGCTGGACCACG GCCGCCCCGGACATCCAGCTCG 

ccccgccgcccccacagccgcccgcagcccgtgcgtcgtIatcagcctcgcaacagcca ccgcgtbtcgacctgctgcccgcggggcctctaggtcgagc 

C A A P V S A G V G H A A L V G A L S V P H S W T T A A P E I 0 L 
CCGTTCAGGCAACACCCACCTTCAGCTCCAGCGCCGGCGCCGACCCGACGGCCCTAAACGGGATGCCCGCACCCCTGCTCA GCCGGATGGCTTTCCCGAG ^ 

ggcaagtcccttgtgggtggaagtccaggIcccggccgcggctcggctgccgggatttgccc tacggccgtccggacgactcgccctaccgaaaccgctc 

A V. 0 a T P T F S S S A G A D P T A L N G M P A G L L S G H A U A S 

CCTCGCCCCACGCCGCACCACCGGCGGTGGCCGCACCCGTAGCGGCACCAGCACTGACGCCCAAGAGCACGCCCGC AAACCCCCGCrAGTTGTGATTA^ 

GCACCGGCGTGCGCCCTGCTGCCCGCCACCGCCGTGGGCATCGCCGTGGTCGTGACTCCCCGTTCTCCTGCCCCCGTTTCGGCCCCATCAACACTAATCT 



I R 



L A A R G T T G G G G T R S G T S T 0 C 0 E 0 G R K P P V V V 

GACCACCCGCCCCCCCGAAACCCCCCCCGGTAAGATTTCTAAATCCATCACACTGGCGGCCGCTCCAC ^^^^ 
CICGTCCCCCCCCCGCCTTICCGCGCCCCCAT TCTAAAGATTTACGTAGTGTCACCGCCGCCGACCrC 
mTCC2 | pET polylinker | XhoI |

EOPPPGMPPR. OF. IHHIGGRSS 

Fig. 2»2. 
CATATGCATCACCATCACCATCACGATCCCGTGGACGCGGTCATTAACACCACCTGCAATTACGGGCAGGTAGTAGCTGCGCTC^^^

' ' ^ * ^ ' H— ! ^ 1 . +-« , ^ ^. 

U6GACGTTAATGCCCGTCCATCATCGACGCGAGTTGCGCT6CCTAGGCC 



GTATACGTAGTGGTAGTGGTAGTGCTAGGGCACCTGCGCCAGTAATTGTGGT'^"- ' ' ' ' 



" " " H H D P V D A V I N T T C N Y G 0 V V A A t N A T 0 P 

GGGCTGCCGCACAGTTCAACGCCTCACCGGTGGCGCAGTCCTATTTGCG^^ 

CCCGACGGCGTGTCAAGTTGCGGAGTGGCCACCGCGTCAGGATAAACGCGTTAAAGGAGCGGCGTGGCGGTGGAGTCGCGCGACG^ 



GAAAOFNASPVAQSY 



LRNFLAA PPPQRAAilAAOL 



GCAAGCTGTGCCGGGGGCGGCACAGTACATCGGCCTTGTCGAGTCGGTTGCCGGCK^ 

CGTTCGACACGGCCCCCGCCGTGTCATGTAGCCGGAACAGCTCAGCCAACGGCCGAGGACGTTGTTGATACTCGAGTACTG^ 

0 A V P G A A 0 Y I G L V E S V A G S C N N Y E L « T I N Y 0 F 6 

GACGTCGACGCTCATGGCGCCATGyCCGCGCTCAGGCGGCGTCGCTTGA^^^ 

CTGCAGCTGCGAGTACCGCGGTACTAGGCGCGAGTCCGCCGCAGCGAACTCCGCCTCGTAGTCCGGTAGCAAGCAC^ 
D V D A H G A M 1 R A Q A A 5 L E AEHOAIVROVLAAGOF 

GGGGCGGCGCCGGTTCGGTGGCTTG CCAGGAGTTCATTACCCAGTTGGGCCGTAACTTCCAGGTGATCTACGAGCAGGCCAACGCCCACGCGCAGAAgr^^ 
' ' ' ' ' ' ' ■ ' ■ ■ I ■ I t 1 I { I _ I 

CCCCGCCGCGGCCAAGCCACCGAACGGTCCTCAAGTAATGGGTCAACCCGGCATTGAAGGTCCACTAGATGCTCGTCCGGTTGCGGGTGCCCGTCTTCCA 
^^^^^SV ACQ EF t TO LGRNFOVIYEOANAHGOKV 

' ' ' ' ' " ' ■ ' ' — ' I ^. L 

GCAGGCTGCCGGCAACAACATGGCGCAAACCGACAGCGCCGTCGGCTCCAGCTGGGCCAC TAGTATGAGCCTTTTGG^ 

' * ^ ' ' ) - ' I ^ - I I I I I I I ■ » ■ ■ t » ■ > II I I I I I I I I I H . j ■■■ I I I I ■ . 1 1 ■ I ■ ■ ■ \ I I I I [ I I I I < J J 

CGTCCGACGGCCGTTGTTGTACCGCGTTTGGCTGTCGCGGCAGCCGAGGTCGACCCGGTGATCATACTCGGAAAACCTACGAGTATAGGGTGTCAACCAC 



OAAGNNMAOTOS 



AVGSSWATSHSLLOAHIPQLV 



GCCTCCCAGTCGGCGTTTG CCGCCAAGGCGGGGCTGATGCGGCACACGATCGGTCAGGCCGAGCAGGCGGCGATGTCGGCTCAGGCGTTTCACrARftftRn 
' ' I I > t ■ I I 1 i I . I I , ] , J ^ J 

CGGAGGGTCAGCCGCAAACGGCGGTTCCGCCCCGACTACGCCGTGTGCTAGCCAGTCCGGCTCGTCCGCCGCTACAGCCGAGTCCGCAAAGTGGTCCCCC 
A S 0 S A F A A K A G L M R H T I G Q A £ 0 A A H S A 0 A F H Q G 

AGTCGTCGGCGGCGTTTCAGGCCGCCCATGCCCGGTTTGTGGCGGCGGCC^^ 

TCAGCAGCCGCCGCAAAGTCCGGCGGGTACGGGCCAAACACCGCCGCCGGCGGTTTCAGTTGTGGAACAACCTACAGCGCGTCCGCTTAGACCCACTCCG

ES SAAFQ AAH ARF V A A A A K V N T L L 0 - V A 0 A N L G E A 

' ' ' ' ' ' I .... I I I I , t 

CGCCGGTACCTATGTGGCCGCCGATGCTGCGGCCGCGTCGACCTATACCGGGTTC^ 

GCGGCCATGGATACACCGGCGGCTACGACGCCGGCGCAGCTGGATATGGCCCAAGCTATAGGTAGTGTGACCGCCGGCGAGCTCGTCTAGGCCGACGATT
^ ^ ^ V A AOAAAASTYTGFOiHHTGGRSSRSGC 

' ' ' ' ' * *• .i-i — — , . ■ ■ . > ■ . ■ *■ — ■ ..L 



"5 
921

GTTTCGGGCTTTCCTTCGACT 

Q S P K G S 
CATATGCATCACCATCACCATCACATGGTGGATTTCGGGGCGTTACCACCGGAGATCAACTCCGCGAGGATGTACGCCGGCCCGGGTTCGGCCTCGCTGG

100

GTATACGTAGTGGTAGTGGTAGTGTACCACCTAAAGCCCCGCAATGGTGGCCTCTAGTTGAGGCGCTCCTACATGCGGCCGGGCCCAAGCCGGAGCGACC 

HMHHHHHHMVDFGALPPEINSARMYAGPGSASL

TGGCCGCGGCTCAGATGTGGGACAGCGTGGCGAGTGACCTGTTTTCGGCCGCGTCGGCGTTTCAGTCGGTGGTCTGGGGTCTGACGGTGGGGTCGTGGAT

200

ACCGGCGCCGAGTCTACACCCTGTCGCACCGCTCACTGGACAAAAGCCGGCGCAGCCGCAAAGTCAGCCACCAGACCCCAGACTGCCACCCCAGCACCTA 

VAAAQMWDSVASDLFSAASAFQSVVWGLTVGSWI

AGGTTCGTCGGCGGGTCTGATGGTGGCGGCGGCCTCGCCGTATGTGGCGTGGATGAGCGTCACCGCGGGGCAGGCCGAGCTGACCGCCGCCCAGGTCCGG

300

TCCAAGCAGCCGCCCAGACTACCACCGCCGCCGGAGCGGCATACACCGCACCTACTCGCAGTGGCGCCCCGTCCGGCTCGACTGGCGGCGGGTCCAGGCC

GSSAGLMVAAASPYVAWMSVTAGQAELTAAQVR

GTTGCTGCGGCGGCCTACGAGACGGCGTATGGGCTGACGGTGCCCCCGCCGGTGATCGCCGAGAACCGTGCTGAACTGATGATTCTGATAGCGACCAACC 

400

CAACGACGCCGCCGGATGCTCTGCCGCATACCCGACTGCCACGGGGGCGGCCACTAGCGGCTCTTGGCACGACTTGACTACTAAGACTATCGCTGGTTGG

VAAAAYETAYGLTVPPPVIAENRAELMILIATN

TCTTGGGGCAAAACACCCCGGCGATCGCGGTCAACGAGGCCGAATACGGCGAGATGTGGGCCCAAGACGCCGCCGCGATGTTTGGCTACGCCGCGGCGAC 
500

AGAACCCCGTTTTGTGGGGCCGCTAGCGCCAGTTGCTCCGGCTTATGCCGCTCTACACCCGGGTTCTGCGGCGGCGCTACAAACCGATGCGGCGCCGCTG 
LLGQNTPAIAVNEAEYGEMWAQDAAAMFGYAAAT

GGCGACGGCGACGGCGACGTTGCTGCCGTTCGAGGAGGCGCCGGAGATGACCAGCGCGGGTGGGCTCCTCGAGCAGGCCGCCGCGGTCGAGGAGGCCTCC

600

CCGCTGCCGCTGCCGCTGCAACGACGGCAAGCTCCTCCGCGGCCTCTACTGGTCGCGCCCACCCGAGGAGCTCGTCCGGCGGCGCCAGCTCCTCCGGAGG

ATATATLLPFEEAPEMTSAGGLLEQAAAVEEAS

GACACCGCCGCGGCGAACCAGTTGATGAACAATGTGCCCCAGGCGCTGCAACAGCTGGCCCAGCCCACGCAGGGCACCACGCCTTCTTCCAAGCTGGGTG 

700

CTGTGGCGGCGCCGCTTGGTCAACTACTTGTTACACGGGGTCCGCGACGTTGTCGACCGGGTCGGGTGCGTCCCGTGGTGCGGAAGAAGGTTCGACCCAC

DTAAANQLMNNVPQALQQLAQPTQGTTPSSKLG

GCCTGTGGAAGACGGTCTCGCCGCATCGGTCGCCGATCAGCAACATGGTGTCGATGGCCAACAACCACATGTCGATGACCAACTCGGGTGTGTCGATGAC 

800

CGGACACCTTCTGCCAGAGCGGCGTAGCCAGCGGCTAGTCGTTGTACCACAGCTACCGGTTGTTGGTGTACAGCTACTGGTTGAGCCCACACAGCTACTG 

GLWKTVSPHRSPISNMVSMANNHMSMTNSGVSMT

CAACACCTTGAGCTCGATGTTGAAGGGCTTTGCTCCGGCGGCGGCCGCCCAGGCCGTGCAAACCGCGGCGCAAAACGGGGTCCGGGCGATGAGCTCGCTG

900

GTTGTGGAACTCGAGCTACAACTTCCCGAAACGAGGCCGCCGCCGGCGGGTCCGGCACGTTTGGCGCCGCGTTTTGCCCCAGGCCCGCTACTCGAGCGAC

NTLSSMLKGFAPAAAAQAVQTAAQNGVRAMSSL
GGCAGCTCGCTGGGTTCTTCGGGTCTGGGCGGTGGGGTGGCCGCCAACTTGGGTCGGGCGGCCTCGGTCGGTTCGTTGTCGGTGCCGCAGGCCTGGGCCG

1000

CCGTCGAGCGACCCAAGAAGCCCAGACCCGCCACCCCACCGGCGGTTGAACCCAGCCCGCCGGAGCCAGCCAAGCAACAGCCACGGCGTCCGGACCCGGC 

GSSLGSSGLGGGVAANLGRAASVGSLSVPQAWA 


CGGCCAACCAGGCAGTCACCCCGGCGGCGCGGGCGCTGCCGCTGACCAGCCTGACCAGCGCCGCGGAAAGAGGGCCCGGGCAGATGCTGGGCGGGCTGCC

1100

GCCGGTTGGTCCGTCAGTGGGGCCGCCGCGCCCGCGACGGCGACTGGTCGGACTGGTCGCGGCGCCTTTCTCCCGGGCCCGTCTACGACCCGCCCGACGG

AANQAVTPAARALPLTSLTSAAERGPGQMLGGLP

GGTGGGGCAGATGGGCGCCAGGGCCGGTGGTGGGCTCAGTGGTGTGCTGCGTGTTCCGCCGCGACCCTATGTGATGCCGCATTCTCCGGCAGCCGGCAAG

1200

CCACCCCGTCTACCCGCGGTCCCGGCCACCACCCGAGTCACCACACGACGCACAAGGCGGCGCTGGGATACACTACGGCGTAAGAGGCCGTCGGCCGTTC 

VGQMGARAGGGLSGVLRVPPRPYVMPHSPAAGK

CTTGATCCCGTGGACGCGGTCATTAACACCACCTGCAATTACGGGCAGGTAGTAGCTGCGCTCAACGCGACGGATCCGGGGGCTGCCGCACAGTTCAACG 

1300

GAACTAGGGCACCTGCGCCAGTAATTGTGGTGGACGTTAATGCCCGTCCATCATCGACGCGAGTTGCGCTGCCTAGGCCCCCGACGGCGTGTCAAGTTGC 

LDPVDAVINTTCNYGQVVAALNATDPGAAAQFN

CCTCACCGGTGGCGCAGTCCTATTTGCGCAATTTCCTCGCCGCACCGCCACCTCAGCGCGCTGCCATGGCCGCGCAATTGCAAGCTGTGCCGGGGGCGGC

1400

GGAGTGGCCACCGCGTCAGGATAAACGCGTTAAAGGAGCGGCGTGGCGGTGGAGTCGCGCGACGGTACCGGCGCGTTAACGTTCGACACGGCCCCCGCCG

ASPVAQSYLRNFLAAPPPQRAAMAAQLQAVPGAA

ACAGTACATCGGCCTTGTCGAGTCGGTTGCCGGCTCCTGCAACAACTATGAGCTCATGACGATTAATTACCAGTTCGGGGACGTCGACGCTCATGGCGCC 

1500

TGTCATGTAGCCGGAACAGCTCAGCCAACGGCCGAGGACGTTGTTGATACTCGAGTACTGCTAATTAATGGTCAAGCCCCTGCAGCTGCGAGTACCGCGG 

QYIGLVESVAGSCNNYELMTINYQFGDVDAHGA

ATGATCCGCGCTCAGGCGGCGTCGCTTGAGGCGGAGCATCAGGCCATCGTTCGTGATGTGTTGGCCGCGGGTGACTTTTGGGGCGGCGCCGGTTCGGTGG 

1600

TACTAGGCGCGAGTCCGCCGCAGCGAACTCCGCCTCGTAGTCCGGTAGCAAGCACTACACAACCGGCGCCCACTGAAAACCCCGCCGCGGCCAAGCCACC 

MIRAQAASLEAEHQAIVRDVLAAGDFWGGAGSV

CTTGCCAGGAGTTCATTACCCAGTTGGGCCGTAACTTCCAGGTGATCTACGAGCAGGCCAACGCCCACGGGCAGAAGGTGCAGGCTGCCGGCAACAACAT 

1700

GAACGGTCCTCAAGTAATGGGTCAACCCGGCATTGAAGGTCCACTAGATGCTCGTCCGGTTGCGGGTGCCCGTCTTCCACGTCCGACGGCCGTTGTTGTA 

ACQEFITQLGRNFQVIYEQANAHGQKVQAAGNNM

GGCGCAAACCGACAGCGCCGTCGGCTCCAGCTGGGCCACTAGTAACGGCCGCCAGTGTGCTGGAATTCTGCAGATATCCATCACACTGGCGGCCGCTCGAG

1800

CCGCGTTTGGCTGTCGCGGCAGCCGAGGTCGACCCGGTGATCATTGCCGGCGGTCACACGACCTTAAGACGTCTATAGGTAGTGTGACCGCCGGCGAGCTC

AQTDSAVGSSWATSNGRQCAGILQISITLAAAR
CATATGCATCACCATCACCATCACATGGCCACCACCCTTCCCGTTCAGCGCCACCCGCGGTCCCTCTTCCCCGAGTTTTCTGAGCTGTTCGCGGCCTTCC
100

GTATACGTAGTGGTAGTGGTAGTGTACCGGTGGTGGGAAGGGCAAGTCGCGGTGGGCGCCAGGGAGAAGGGGCTCAAAAGACTCGACAAGCGCCGGAAGG

HMHHHHHHMATTLPVQRHPRSLFPEFSELFAAF

CGTCATTCGCCGGACTCCGGCCCACCTTCGACACCCGGTTGATGCGGCTGGAAGACGAGATGAAAGAGGGGCGCTACGAGGTACGCGCGGAGCTTCCCGG

200

GCAGTAAGCGGCCTGAGGCCGGGTGGAAGCTGTGGGCCAACTACGCCGACCTTCTGCTCTACTTTCTCCCCGCGATGCTCCATGCGCGCCTCGAAGGGCC

PSFAGLRPTFDTRLMRLEDEMKEGRYEVRAELPG

GGTCGACCCCGACAAGGACGTCGACATTATGGTCCGCGATGGTCAGCTGACCATCAAGGCCGAGCGCACCGAGCAGAAGGACTTCGACGGTCGCTCGGAA 

300

CCAGCTGGGGCTGTTCCTGCAGCTGTAATACCAGGCGCTACCAGTCGACTGGTAGTTCCGGCTCGCGTGGCTCGTCTTCCTGAAGCTGCCAGCGAGCCTT

VDPDKDVDIMVRDGQLTIKAERTEQKDFDGRSE

TTCGCGTACGGTTCCTTCGTTCGCACGGTGTCGCTGCCGGTAGGTGCTGACGAGGACGACATTAAGGCCACCTACGACAAGGGCATTCTTACTGTGTCGG 

400

AAGCGCATGCCAAGGAAGCAAGCGTGCCACAGCGACGGCCATCCACGACTGCTCCTGCTGTAATTCCGGTGGATGCTGTTCCCGTAAGAATGACACAGCC 

FAYGSFVRTVSLPVGADEDDIKATYDKGILTVS

TGGCGGTTTCGGAAGGGAAGCCAACCGAAAAGCACATTCAGATCCGGTCCACCAACAAGCTTGATCCCGTGGACGCGGTCATTAACACCACCTGCAATTA 

500

ACCGCCAAAGCCTTCCCTTCGGTTGGCTTTTCGTGTAAGTCTAGGCCAGGTGGTTGTTCGAACTAGGGCACCTGCGCCAGTAATTGTGGTGGACGTTAAT 

VAVSEGKPTEKHIQIRSTNKLDPVDAVINTTCNY

CGGGCAGGTAGTAGCTGCGCTCAACGCGACGGATCCGGGGGCTGCCGCACAGTTCAACGCCTCACCGGTGGCGCAGTCCTATTTGCGCAATTTCCTCGCC

600

GCCCGTCCATCATCGACGCGAGTTGCGCTGCCTAGGCCCCCGACGGCGTGTCAAGTTGCGGAGTGGCCACCGCGTCAGGATAAACGCGTTAAAGGAGCGG 

GQVVAALNATDPGAAAQFNASPVAQSYLRNFLA

GCACCGCCACCTCAGCGCGCTGCCATGGCCGCGCAATTGCAAGCTGTGCCGGGGGCGGCACAGTACATCGGCCTTGTCGAGTCGGTTGCCGGCTCCTGCA 

700

CGTGGCGGTGGAGTCGCGCGACGGTACCGGCGCGTTAACGTTCGACACGGCCCCCGCCGTGTCATGTAGCCGGAACAGCTCAGCCAACGGCCGAGGACGT 

APPPQRAAMAAQLQAVPGAAQYIGLVESVAGSC

ACAACTATGAGCTCATGACGATTAATTACCAGTTCGGGGACGTCGACGCTCATGGCGCCATGATCCGCGCTCAGGCGGCGTCGCTTGAGGCGGAGCATCA

800

TGTTGATACTCGAGTACTGCTAATTAATGGTCAAGCCCCTGCAGCTGCGAGTACCGCGGTACTAGGCGCGAGTCCGCCGCAGCGAACTCCGCCTCGTAGT

NNYELMTINYQFGDVDAHGAMIRAQAASLEAEHQ

GGCCATCGTTCGTGATGTGTTGGCCGCGGGTGACTTTTGGGGCGGCGCCGGTTCGGTGGCTTGCCAGGAGTTCATTACCCAGTTGGGCCGTAACTTCCAG 

900

CCGGTAGCAAGCACTACACAACCGGCGCCCACTGAAAACCCCGCCGCGGCCAAGCCACCGAACGGTCCTCAAGTAATGGGTCAACCCGGCATTGAAGGTC 

AIVRDVLAAGDFWGGAGSVACQEFITQLGRNFQ
GTGATCTACGAGCAGGCCAACGCCCACGGGCAGAAGGTGCAGGCTGCCGGCAACAACATGGCGCAAACCGACAGCGCCGTCGGCTCCAGCTGGGCCACTA
1000

CACTAGATGCTCGTCCGGTTGCGGGTGCCCGTCTTCCACGTCCGACGGCCGTTGTTGTACCGCGTTTGGCTGTCGCGGCAGCCGAGGTCGACCCGGTGAT
VIYEQANAHGQKVQAAGNNMAQTDSAVGSSWAT

GTAACGGCCGCCAGTGTGCTGGAATTCTGCAGATATCCATCACACTGGCGGCCGCTCGAGCAGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCTGAGTT
1100

CATTGCCGGCGGTCACACGACCTTAAGACGTCTATAGGTAGTGTGACCGCCGGCGAGCTCGTCTAGGCCGACGATTGTTTCGGGCTTTCCTTCGACTCAA
SNGRQCAGILQISITLAAARADPAANKARKEAEL

GGCT 
CCGA 



II B 
CATATGCATCACCATCACCATCACATGGTGGATTTCGGGGCGTTACCACCGGAGATCAACTCCGCGAGGATGTACGCCGGCCCGGGTTCGGCCTCGCTGG

100

GTATACGTAGTGGTAGTGGTAGTGTACCACCTAAAGCCCCGCAATGGTGGCCTCTAGTTGAGGCGCTCCTACATGCGGCCGGGCCCAAGCCGGAGCGACC

HMHHHHHHMVDFGALPPEINSARMYAGPGSASL

TGGCCGCGGCTCAGATGTGGGACAGCGTGGCGAGTGACCTGTTTTCGGCCGCGTCGGCGTTTCAGTCGGTGGTCTGGGGTCTGACGGTGGGGTCGTGGAT

200

ACCGGCGCCGAGTCTACACCCTGTCGCACCGCTCACTGGACAAAAGCCGGCGCAGCCGCAAAGTCAGCCACCAGACCCCAGACTGCCACCCCAGCACCTA 

VAAAQMWDSVASDLFSAASAFQSVVWGLTVGSWI

AGGTTCGTCGGCGGGTCTGATGGTGGCGGCGGCCTCGCCGTATGTGGCGTGGATGAGCGTCACCGCGGGGCAGGCCGAGCTGACCGCCGCCCAGGTCCGG 

300

TCCAAGCAGCCGCCCAGACTACCACCGCCGCCGGAGCGGCATACACCGCACCTACTCGCAGTGGCGCCCCGTCCGGCTCGACTGGCGGCGGGTCCAGGCC 

GSSAGLMVAAASPYVAWMSVTAGQAELTAAQVR

GTTGCTGCGGCGGCCTACGAGACGGCGTATGGGCTGACGGTGCCCCCGCCGGTGATCGCCGAGAACCGTGCTGAACTGATGATTCTGATAGCGACCAACC

400

CAACGACGCCGCCGGATGCTCTGCCGCATACCCGACTGCCACGGGGGCGGCCACTAGCGGCTCTTGGCACGACTTGACTACTAAGACTATCGCTGGTTGG

VAAAAYETAYGLTVPPPVIAENRAELMILIATN 

TCTTGGGGCAAAACACCCCGGCGATCGCGGTCAACGAGGCCGAATACGGCGAGATGTGGGCCCAAGACGCCGCCGCGATGTTTGGCTACGCCGCGGCGAC

500

AGAACCCCGTTTTGTGGGGCCGCTAGCGCCAGTTGCTCCGGCTTATGCCGCTCTACACCCGGGTTCTGCGGCGGCGCTACAAACCGATGCGGCGCCGCTG 

LLGQNTPAIAVNEAEYGEMWAQDAAAMFGYAAAT

GGCGACGGCGACGGCGACGTTGCTGCCGTTCGAGGAGGCGCCGGAGATGACCAGCGCGGGTGGGCTCCTCGAGCAGGCCGCCGCGGTCGAGGAGGCCTCC 

600

CCGCTGCCGCTGCCGCTGCAACGACGGCAAGCTCCTCCGCGGCCTCTACTGGTCGCGCCCACCCGAGGAGCTCGTCCGGCGGCGCCAGCTCCTCCGGAGG

ATATATLLPFEEAPEMTSAGGLLEQAAAVEEAS 

GACACCGCCGCGGCGAACCAGTTGATGAACAATGTGCCCCAGGCGCTGCAACAGCTGGCCCAGCCCACGCAGGGCACCACGCCTTCTTCCAAGCTGGGTG

700

CTGTGGCGGCGCCGCTTGGTCAACTACTTGTTACACGGGGTCCGCGACGTTGTCGACCGGGTCGGGTGCGTCCCGTGGTGCGGAAGAAGGTTCGACCCAC

DTAAANQLMNNVPQALQQLAQPTQGTTPSSKLG

GCCTGTGGAAGACGGTCTCGCCGCATCGGTCGCCGATCAGCAACATGGTGTCGATGGCCAACAACCACATGTCGATGACCAACTCGGGTGTGTCGATGAC 

800

CGGACACCTTCTGCCAGAGCGGCGTAGCCAGCGGCTAGTCGTTGTACCACAGCTACCGGTTGTTGGTGTACAGCTACTGGTTGAGCCCACACAGCTACTG

GLWKTVSPHRSPISNMVSMANNHMSMTNSGVSMT

CAACACCTTGAGCTCGATGTTGAAGGGCTTTGCTCCGGCGGCGGCCGCCCAGGCCGTGCAAACCGCGGCGCAAAACGGGGTCCGGGCGATGAGCTCGCTG 

900

GTTGTGGAACTCGAGCTACAACTTCCCGAAACGAGGCCGCCGCCGGCGGGTCCGGCACGTTTGGCGCCGCGTTTTGCCCCAGGCCCGCTACTCGAGCGAC 

NTLSSMLKGFAPAAAAQAVQTAAQNGVRAMSSL
GGCAGCTCGCTGGGTTCTTCGGGTCTGGGCGGTGGGGTGGCCGCCAACTTGGGTCGGGCGGCCTCGGTCGGTTCGTTGTCGGTGCCGCAGGCCTGGGCCG

1000

CCGTCGAGCGACCCAAGAAGCCCAGACCCGCCACCCCACCGGCGGTTGAACCCAGCCCGCCGGAGCCAGCCAAGCAACAGCCACGGCGTCCGGACCCGGC

GSSLGSSGLGGGVAANLGRAASVGSLSVPQAWA

CGGCCAACCAGGCAGTCACCCCGGCGGCGCGGGCGCTGCCGCTGACCAGCCTGACCAGCGCCGCGGAAAGAGGGCCCGGGCAGATGCTGGGCGGGCTGCC

1100

GCCGGTTGGTCCGTCAGTGGGGCCGCCGCGCCCGCGACGGCGACTGGTCGGACTGGTCGCGGCGCCTTTCTCCCGGGCCCGTCTACGACCCGCCCGACGG 

AANQAVTPAARALPLTSLTSAAERGPGQMLGGLP

GGTGGGGCAGATGGGCGCCAGGGCCGGTGGTGGGCTCAGTGGTGTGCTGCGTGTTCCGCCGCGACCCTATGTGATGCCGCATTCTCCGGCAGCCGGCGAT 

1200

CCACCCCGTCTACCCGCGGTCCCGGCCACCACCCGAGTCACCACACGACGCACAAGGCGGCGCTGGGATACACTACGGCGTAAGAGGCCGTCGGCCGCTA 

VGQMGARAGGGLSGVLRVPPRPYVMPHSPAAGD

ATCGCCCCGCCGGCCTTGTCGCAGGACCGGTTCGCCGACTTCCCCGCGCTGCCCCTCGACCCGTCCGCGATGGTCGCCCAAGTGGGGCCACAGGTGGTCA 

1300

TAGCGGGGCGGCCGGAACAGCGTCCTGGCCAAGCGGCTGAAGGGGCGCGACGGGGAGCTGGGCAGGCGCTACCAGCGGGTTCACCCCGGTGTCCACCAGT 

IAPPALSQDRFADFPALPLDPSAMVAQVGPQVV

ACATCAACACCAAACTGGGCTACAACAACGCCGTGGGCGCCGGGACCGGCATCGTCATCGATCCCAACGGTGTCGTGCTGACCAACAACCACGTGATCGC 

1400

TGTAGTTGTGGTTTGACCCGATGTTGTTGCGGCACCCGCGGCCCTGGCCGTAGCAGTAGCTAGGGTTGCCACAGCACGACTGGTTGTTGGTGCACTAGCG 

NINTKLGYNNAVGAGTGIVIDPNGVVLTNNHVIA



GGGCGCCACCGACATCAATGCGTTCAGCGTCGGCTCCGGCCAAACCTACGGCGTCGATGTGGTCGGGTATGACCGCACCCAGGATGTCGCGGTGCTGCAG 

1500

CCCGCGGTGGCTGTAGTTACGCAAGTCGCAGCCGAGGCCGGTTTGGATGCCGCAGCTACACCAGCCCATACTGGCGTGGGTCCTACAGCGCCACGACGTC

GATDINAFSVGSGQTYGVDVVGYDRTQDVAVLQ

CTGCGCGGTGCCGGTGGCCTGCCGTCGGCGGCGATCGGTGGCGGCGTCGCGGTTGGTGAGCCCGTCGTCGCGATGGGCAACAGCGGTGGGCAGGGCGGAA

1600

GACGCGCCACGGCCACCGGACGGCAGCCGCCGCTAGCCACCGCCGCAGCGCCAACCACTCGGGCAGCAGCGCTACCCGTTGTCGCCACCCGTCCCGCCTT

LRGAGGLPSAAIGGGVAVGEPVVAMGNSGGQGG



CGCCCCGTGCGGTGCCTGGCAGGGTGGTCGCGCTCGGCCAAACCGTGCAGGCGTCGGATTCGCTGACCGGTGCCGAAGAGACATTGAACGGGTTGATCCA

1700
GCGGGGCACGCCACGGACCGTCCCACCAGCGCGAGCCGGTTTGGCACGTCCGCAGCCTAAGCGACTGGCCACGGCTTCTCTGTAACTTGCCCAACTAGGT

TPRAVPGRVVALGQTVQASDSLTGAEETLNGLIQ
GTTCGATGCCGCGATCCAGCCCGGTGATTCGGGCGGGCCCGTCGTCAACGGCCTAGGACAGGTGGTCGGTATGAACACGGCCGCGTCCTAGGATATC
CAAGCTACGGCGCTAGGTCGGGCCACTAAGCCCGCCCGGGCAGCAGTTGCCGGATCCTGTCCACCAGCCATACTTGTGCCGGCGCAGGATCCTATAG 

FDAAIQPGDSGGPVVNGLGQVVGMNTAAS.DI
ACCGGCTCCCTGGGGGCCCCCTTAAGCTGCTGCTGCTGTTCCTACGTGGACTGGGCGTAGTCGGCCTGTACTGCTTTCCGATAACGGGCCCACCGGCTAC
CGGTTTTGGCGACTTGGCCCTGTGCGACGGCGAGAAGTACCCCGACGGCTCCTTTTGGCACCAGTGGATGCAAACCTGGTTTACCGGCCCACAGTTTTAC
CCCAAAACCGCTGAACCGGCACACGCTGCCGCTCTTCATGGGGCTGCCGAGCAAAACCGTGGTCACCTACGTTTCCACCAAATGGCCGGGTGTCAAAATG

GFGDLA VCOGEKYPOGSFWHOWMQTWFTCPOFY 
CVLATWPCATARSTPTARFGTSGCKRGLPAHSFT 
GFWRLGRVRRREVP R R L V L A P V 0 A N V V Y R P T V L . 

Dra III 
pieAl yan91l pooHl

TTCGATTGTGTCAGCGGCGGTGAGCCCCTCCCCGGCCCGCCGCCACCGGGTGGTTGCGGTGGGGCAATTCCGTCCGAGCAGCCCAACGCTCCCTGAGAAT
AACCTAACACAGTCGCCGCCACTCGGGGAGGGGCCGGGCGGCGGTGGCCCACCAACGCCACCCCGTTAAGGCAGGCTCGTCGGGTTCCGAGGGACTCTTA

FOCVSGGEPLPGPPPPGGCGGAIPSEQPNAP. E 
SI VSAAVSPSPARRHRVVAVG O F RPSSPTLPEM 